Abstract

The goal of this paper is to provide a rigorous information-theoretic analysis of subnetworks of interference
networks. We prove two coding theorems for the compound multiple-access channel with an arbitrary number of
channel states. The channel state information at the transmitters is such that each transmitter has a finite partition of
the set of states and knows which element of the partition the actual state belongs to. The receiver may have arbitrary
channel state information. The first coding theorem is for the case that both transmitters have a common message and
that each has an additional private message. The second coding theorem is for the case where rate-constrained, but
noiseless transmitter cooperation is possible. This cooperation may be used to exchange information about channel
state information as well as the messages to be transmitted. The cooperation protocol used here generalizes Willems'
conferencing. We show how this models base station cooperation in modern wireless cellular networks used for
interference coordination and capacity enhancement. In particular, the coding theorem for the cooperative case shows
how much cooperation is necessary in order to achieve maximal capacity in the network considered.



Index Terms 
I. Introduction

A. Motivation

In modern cellular systems, interference is one of the main factors which limit the communication capacity. 
In order to further enhance performance, methods to better control interference have recently been investigated 
intensively. One of the principal techniques to achieve this is cooperation among neighboring base stations. This will 
be part of the forthcoming LTE-Advanced cellular standard. It is seen as a means of achieving the desired spectral
efficiency of mobile networks. In addition, it may enhance the performance of cell-edge users, a very important 
performance metric of future wireless cellular systems. Finally, fairness issues are expected to be resolved more 
easily with base station cooperation. 

In standardization-oriented literature, the assumptions generally are very strict. The cooperation backbones, i.e.
the wires linking the base stations, are assumed to have infinite capacity. Full channel state information (CSI) is
assumed to be present at all cooperating base stations. Then, multiple-input-multiple-output (MIMO) optimization
techniques can be used for designing the system [10]. However, while providing a useful theoretical benchmark, the
results thus obtained are not accepted by the operators as reliably predicting the performance of actual networks.

In order to obtain a more realistic assessment of the performance of cellular networks with base station coop- 
eration, the above assumptions need to be adapted to reality. First, it is well-known that one cannot really assume 
perfect CSI in mobile communication networks. Second, glass fibers or any medium used for the backbones never 
have infinite capacity. The assumption of finite cooperation capacity will also lead to a better understanding of 
the amount of cooperation necessary to achieve a certain performance. Vice versa, we would like to know which 
capacity can be achieved with the backhaul found in heterogeneous networks using microwave, optical fibers and 
other media. Such insights would get lost when assuming infinite cooperation capacity. 

The question arises how much cooperation is needed in order to achieve the same performance as would be 
achievable with infinite cooperation capacity. For general interference networks with multiple receivers, the analysis 
is very difficult. Thus it is natural to start by taking a closer look at component networks which together form a 
complete interference network. Such components are those subnetworks formed by the complete set of base stations, 
but with only one receiving mobile. Then there is no more interference, so one can concentrate on finding out by 
how much the capacity increases by limited base station cooperation. This result can be seen as a first step towards 
a complete rigorous analysis of general interference networks. 

A situation which is closely related can be phrased in the cooperation setting as well. Usually, there is only one 
data stream intended for one receiver. Assume that a central node splits this data stream into two components. Each 
of these components is then forwarded to one of two base stations. Using the cooperation setting, one can address 
the question how much overhead needs to be transmitted by the splitter with the data component, i.e. how much 
information about the data component and the CSI intended for one base station needs to be known at the other 
base station in order to achieve a high, possibly maximal data rate. 

In ifm . the cooperation of base stations in an uplink network is analyzed. A turbo-like decoding scheme is 
proposed. Different degrees of cooperation and different cooperation topologies are compared in numerical simu- 
lations. In im, work has also been done on the practical level to analyze cooperative schemes. The implementation 
of a real-time distributed cooperative system for the downlink of the fourth-generation standard LTE-Advanced was
presented. In that system, the channel state information (CSI) at the transmitters was imperfect, and the limited-capacity
glass fibers between the transmitting base stations were used to exchange CSI and data information. A feeder
distributed the data among the transmitting base stations.

A question which is not addressed in this work but which will be considered in the future is what rates can be
achieved if there are two networks as described above which belong to different providers and which hence do not 
jointly optimize their coding, to say nothing of active cooperation. In that case, uncontrolled interference heavily 
disturbs each network, and challenges different from those considered here need to be faced by the system designer. 

B. Theory 

The rigorous analysis of such cellular wireless systems as described above using information-theoretic methods 
should provide useful insights. The ultimate performance limits as well as the optimal cooperation protocols can be 
derived from such an analysis. The first information-theoretic approach to schemes with cooperating encoders goes 
back to Willems [20], [21], long before this issue was relevant for practical networks. For that reason, it was not
considered much in the next two decades. Willems considers a protocol where before transmission, the encoders 
of a discrete memoryless Multiple Access Channel (MAC) may exchange information about their messages via 
noiseless finite-capacity links (one in each direction). This may be done in a causal and iterative fashion, so the 
protocol is called a conferencing protocol. 

For the reasons mentioned at the beginning, Willems' conferencing protocol has attracted interest in recent years. 
Gaussian MACs using Willems conferencing between the encoders were analyzed in [3] and [19]. Moreover, in
these two works, it was shown that interference which is known non-causally at the encoders does not reduce
capacity. For a compound MAC, both discrete and Gaussian, with two possible channel realizations and full CSI
at the receiver, the capacity region was found in [12]. In the same paper, the capacity region was found for the
interference channel if only one transmitter can send information to the other (unidirectional cooperation) and if
the channel is in the strong interference regime. Another variant of unidirectional cooperation was investigated in
[16], where the three encoders of a Gaussian MAC can cooperate over a ring of unidirectional links. However, only
lower and upper bounds were found for the maximum achievable equal rate.

Further literature exists for Willems conferencing on the decoding side of a multi-user network. For degraded 
discrete broadcast channels, the capacity region was found in [6] if the receivers can exchange information about
the received codewords in a single conference step. For the general broadcast and multicast channels, achievability
regions were determined. For the Gaussian relay channel, the dependence of the performance on the number of
conferencing iterations between the receiver and the relay was investigated in [13]. For the Gaussian Z-interference
channel, outer and inner bounds to the capacity region where the decoders can exchange information about the 
channel outputs are provided in [7]. Finally, for discrete and Gaussian memoryless interference channels with 
conferencing decoders and where the senders have a common message, [15] determines achievable regions. Exact
capacity regions are determined if the channel is physically degraded. If the encoders can conference instead of 
having a common message, the situation is the same. 

The discrete MAC with conferencing encoders is closely related to the discrete MAC with common message. 
Intuitively, the messages exchanged between the encoders in the cooperative setting form a common message, so 
the results known for the corresponding non-cooperative channel with common message can be applied to find the 
achievable rates of the cooperative setting. This transition was used in [20], [21], [3], [19], and [12]. The capacity
region of the MAC with common message was determined in QJI, a simpler proof was found in EOl . 

The goal of this paper is to generalize the original setting considered by Willems even further. We treat a
compound discrete memoryless MAC with an arbitrary number of channel realizations. The receiver's CSI (CSIR)
may be arbitrary between full and absent. The possible transmitter's CSI (CSIT) may be different from CSIR and 
asymmetric at the two encoders. It is restricted to a finite number of instances, even though the number of actual 
channel realizations may be infinite. For this channel, we consider two cases. First, we characterize the capacity 
region of this channel where the transmitters have a common message. Then, we determine the capacity region of 
the channel where there is no common message any more. Instead, the encoders have access to the output of a rate- 
constrained noiseless two-user MAC. Each input node of the noiseless MAC corresponds to one of the transmitters 
of the compound MAC. Each input to the noiseless MAC consists of the pair formed by the message which is 
to be transmitted and the CSIT present at the corresponding transmitter. This generalizes Willems' conferencing
to a non-causal conferencing protocol, where the conferencing capacities considered by Willems correspond to the 
rate constraints of the noiseless MAC in the generalized model. It turns out that this non-causal conferencing does 
not increase the capacity region, and as in [20], [21], every rate contained in the capacity region can be achieved
using a one-shot Willems "conference". We determine how large the conferencing capacities need to be in order to 
achieve the full-cooperation sum rate and the full-cooperation capacity region, respectively. The latter is particularly 
interesting because it shows that forming a "virtual MIMO system" as mentioned in Subsection I-A and considered
in [10] does not require infinite cooperation capacity.

C. Organization of the Paper 

In Section II, we address the problems presented above. We present the two basic channel models underlying our
analysis: the compound MAC with common message and partial CSI and the compound MAC with conferencing 
encoders and partial CSI. We also introduce the generalized conferencing protocol used in the analysis of the 
conferencing MAC. We state the main results concerning the capacity regions of the two models. We also derive 
the minimal amount of cooperation needed in the conferencing setting in order to achieve the optimal (i.e. full- 
cooperation) sum rate and the optimal, full-cooperation rate region. The achievability of the rate regions claimed 
in the main theorems is shown in Section III. The weak converses are shown in Section IV. Only the converse for
the conferencing MAC is presented in detail, because the converse for the MAC with common message is similar 
to part of the converse for the MAC with conferencing encoders. We address the application of the MAC with 
conferencing encoders to the analysis of cellular systems where one data stream is split up and sent using different 
base stations in Section V. In the same section, in a simple numerical example, the capacity regions of a MAC with
conferencing encoders are plotted for various amounts of cooperation. In the final section, we sum up the paper and
discuss the directions of future research. In the Appendix several auxiliary lemmata concerning typical sequences 
are collected. 
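
D. Notation

For real numbers $a$ and $b$, we set $a \wedge b := \min(a, b)$ and $a \vee b := \max(a, b)$.

For any positive integer $m$, we write $[1, m]$ for the set $\{1, \ldots, m\}$. The complement of a set $F \subset \mathcal{X}$ in $\mathcal{X}$ is
denoted by $F^c$. The function $1_F$ is the indicator function of $F$, i.e. $1_F(x)$ equals 1 if $x \in F$ and 0 else. For a
set $E \subset \mathcal{X} \times \mathcal{Y}$, we write $E|_y := \{x \in \mathcal{X} : (x, y) \in E\}$. For a mapping $f : \mathcal{X} \to \mathcal{Y}$, define $\|f\|$ to be the
cardinality of the range of $f$.

Denote the set of probability measures on a discrete set $\mathcal{X}$ by $\mathcal{P}(\mathcal{X})$. The $n$-fold product of a $p \in \mathcal{P}(\mathcal{X})$ is
denoted by $p^n \in \mathcal{P}(\mathcal{X}^n)$. By $\mathcal{K}(\mathcal{Y}|\mathcal{X})$, we denote the set of stochastic matrices with rows indexed by $\mathcal{X}$ and
columns indexed by $\mathcal{Y}$. The $n$-fold memoryless extension of a $W \in \mathcal{K}(\mathcal{Y}|\mathcal{X})$ is defined as

$$W^n(\mathbf{y}|\mathbf{x}) := \prod_{m=1}^{n} W(y_m|x_m),$$

where $\mathbf{x} = (x_1, \ldots, x_n) \in \mathcal{X}^n$, $\mathbf{y} = (y_1, \ldots, y_n) \in \mathcal{Y}^n$.

Let $\mathcal{X}$ be a finite set. For $\mathbf{x} = (x_1, \ldots, x_n) \in \mathcal{X}^n$, define the type $p_{\mathbf{x}} \in \mathcal{P}(\mathcal{X})$ of $\mathbf{x}$ by $n\,p_{\mathbf{x}}(x) = |\{i : x_i = x\}|$.
For $\delta > 0$ and $p \in \mathcal{P}(\mathcal{X})$, define $T^n_{p,\delta}$ to be the set of those $\mathbf{x} \in \mathcal{X}^n$ such that $|p_{\mathbf{x}}(x) - p(x)| \le \delta$ for all $x \in \mathcal{X}$ and
such that $p_{\mathbf{x}}(x) = 0$ if $p(x) = 0$.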
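
The definitions of the type and of the typical set can be made concrete with a small computation. The following is a minimal sketch only; the function names `sequence_type` and `is_typical` are hypothetical, and the non-strict inequality in the typicality test is an assumption about the intended convention.

```python
from collections import Counter


def sequence_type(x, alphabet):
    """Type (empirical distribution) of the sequence x over the given alphabet."""
    counts = Counter(x)
    n = len(x)
    return {a: counts[a] / n for a in alphabet}


def is_typical(x, p, delta):
    """Membership test for the delta-typical set of p (assumed convention: <=).

    x is typical if |p_x(a) - p(a)| <= delta for every letter a, and if
    letters with p(a) = 0 do not occur in x at all.
    """
    if any(symbol not in p for symbol in x):
        return False
    p_x = sequence_type(x, p.keys())
    for a, prob in p.items():
        if abs(p_x[a] - prob) > delta:
            return False
        if prob == 0 and p_x[a] != 0:
            return False
    return True


# Illustrative distribution on the alphabet {a, b, c}.
p = {"a": 0.5, "b": 0.5, "c": 0.0}
print(sequence_type("aabb", p.keys()))
print(is_typical("aabb", p, 0.1), is_typical("aabc", p, 0.5))
```

The second test sequence fails even for a large tolerance because it contains a letter of probability zero.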

II. Channel Model and Main Results 

A. The Channel Model 

Let $\mathcal{X}, \mathcal{Y}, \mathcal{Z}$ be finite sets. A compound discrete memoryless MAC with input alphabets $\mathcal{X}$ and $\mathcal{Y}$ and output
alphabet $\mathcal{Z}$ is determined by a set of stochastic matrices $\mathcal{W} \subset \mathcal{K}(\mathcal{Z}|\mathcal{X} \times \mathcal{Y})$. $\mathcal{W}$ may be finite or infinite. Every
$W \in \mathcal{W}$ corresponds to a different channel state, so we will also call the elements $W$ the states of the compound
MAC $\mathcal{W}$. The transmitter using alphabet $\mathcal{X}$ will be called transmitter (sender, encoder) 1 and the transmitter with
alphabet $\mathcal{Y}$ will be called transmitter (sender, encoder) 2. If transmitter 1 sends a word $\mathbf{x} = (x_1, \ldots, x_n) \in \mathcal{X}^n$
and transmitter 2 sends a word $\mathbf{y} = (y_1, \ldots, y_n) \in \mathcal{Y}^n$, and if the channel state is $W \in \mathcal{W}$, then the receiver will
receive the word $\mathbf{z} = (z_1, \ldots, z_n) \in \mathcal{Z}^n$ with probability

$$W^n(\mathbf{z}|\mathbf{x}, \mathbf{y}) := \prod_{m=1}^{n} W(z_m|x_m, y_m).$$

The compound channel model does not include a change of state in the middle of a transmission block. 
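
As an illustration of the product formula and of the fact that the state is fixed over a block, the following sketch evaluates the block transition probability for each state of a toy compound MAC. The two-state family of noisy binary XOR channels and all names are assumptions chosen for illustration; they do not appear in the paper.

```python
import math


def xor_channel(flip):
    """Hypothetical MAC state: output is x XOR y, flipped with probability `flip`."""
    return {
        (a, b): {a ^ b: 1.0 - flip, 1 - (a ^ b): flip}
        for a in (0, 1)
        for b in (0, 1)
    }


def memoryless_extension(W, x, y, z):
    """Probability of receiving the word z when x and y are sent in state W."""
    assert len(x) == len(y) == len(z)
    return math.prod(W[(a, b)][c] for a, b, c in zip(x, y, z))


# A toy compound MAC with two states; the state does not change within a block.
compound = {"good": xor_channel(0.0), "bad": xor_channel(0.1)}

x, y, z = (0, 1), (0, 0), (0, 1)
for name, W in compound.items():
    print(name, memoryless_extension(W, x, y, z))

# A code must work for every state, so the worst case over states is what matters.
worst = min(memoryless_extension(W, x, y, z) for W in compound.values())
```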

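The goal is to find codes that are "good" (in a sense to be specified later) universally for all those channel states
which might be the actual one according to CSI. In our setting, CSI at sender $\nu$ is given by a finite CSIT partition

$$t_\nu = \{\mathcal{W}_{\tau_\nu} \subset \mathcal{W} : \tau_\nu \in T_\nu\} \tag{1}$$

for $\nu = 1, 2$. The sets $T_1, T_2$ are finite, and the $\mathcal{W}_{\tau_\nu}$ satisfy

$$\bigcup_{\tau_\nu \in T_\nu} \mathcal{W}_{\tau_\nu} = \mathcal{W}, \quad \text{and} \quad \mathcal{W}_{\tau_\nu} \cap \mathcal{W}_{\tau_\nu'} = \emptyset \ \text{ if } \tau_\nu \neq \tau_\nu'.$$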

Before encoding, transmitter $\nu$ knows which element of the partition the actual channel state is contained in, i.e.
if $W \in \mathcal{W}_{\tau_\nu}$ is the channel state, then it knows $\tau_\nu$. With this knowledge, it can adjust its codebook to the channel
conditions to some degree. For $\tau = (\tau_1, \tau_2) \in T_1 \times T_2$, we denote by

$$\mathcal{W}_\tau := \mathcal{W}_{\tau_1 \tau_2} := \mathcal{W}_{\tau_1} \cap \mathcal{W}_{\tau_2}$$

the set of channel states which is possible according to the combined channel knowledge of both transmitters. Note
that every function from $\mathcal{W}$ into a finite set induces a finite partition as in (1), so this is a very general concept of
CSIT. At the receiver side, the knowledge about the channel state is given by a not necessarily finite CSIR partition

$$r = \{\mathcal{W}_\rho \subset \mathcal{W} : \rho \in R\}. \tag{2}$$

$R$ is an arbitrary set and the sets $\mathcal{W}_\rho$ satisfy

$$\bigcup_{\rho \in R} \mathcal{W}_\rho = \mathcal{W} \quad \text{and} \quad \mathcal{W}_\rho \cap \mathcal{W}_{\rho'} = \emptyset \ \text{ if } \rho \neq \rho'.$$

If the channel state is $W \in \mathcal{W}_\rho$, then the receiver knows $\rho$. Thus it can adjust its decision rule to this partial channel
knowledge. This concept includes any kind of deterministic CSIR, because any function from $\mathcal{W}$ into an arbitrary
set induces a partition as in (2). Note that if $\mathcal{W}$ is infinite, the transmitters can never have full CSI, whereas this
is possible for the receiver if $r = \{\{W\} : W \in \mathcal{W}\}$.
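
The remark that every function from the state set into a finite set induces a CSIT partition can be illustrated as follows. This is a sketch under illustrative assumptions: states are represented by hypothetical pairs of integer "gain levels", and the helper names are not from the paper.

```python
from collections import defaultdict


def induced_partition(states, csi):
    """Partition of `states` induced by the function `csi`: one block per value."""
    blocks = defaultdict(set)
    for w in states:
        blocks[csi(w)].add(w)
    return dict(blocks)


def jointly_possible(t1, t2, tau1, tau2):
    """States consistent with both transmitters' knowledge (block intersection)."""
    return t1[tau1] & t2[tau2]


# Hypothetical finite state set: each state is a pair of gain levels.
states = {(a, b) for a in (1, 2, 3) for b in (1, 2, 3)}

# Transmitter 1 only learns whether the first gain is at least 2;
# transmitter 2 learns the second gain exactly.
t1 = induced_partition(states, lambda w: w[0] >= 2)
t2 = induced_partition(states, lambda w: w[1])

print(jointly_possible(t1, t2, True, 3))
```

The blocks are disjoint and cover the state set by construction, and the CSIT may be asymmetric, as in the example.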

Definition 1. The compound discrete memoryless MAC $\mathcal{W}$ together with the CSIT partitions $t_1, t_2$ and the CSIR
partition $r$ is denoted by the quadruple $(\mathcal{W}, t_1, t_2, r)$.

Example 1. There are several communication situations which are appropriately described by a compound MAC. 
One case is where information is to be sent from two transmitting terminals to one receiving terminal through a 
fading channel. If the channel remains constant during one transmission block, one obtains a compound channel. 
Usually, CSIT is not perfect. It might be, however, that the transmitters have access to partial CSI, e.g. by using 
feedback. This will not determine an exact channel state, but only an approximation. Coding must then be done in 
such a way that it is good for all those channel realizations which are possible according to CSIT. 

Another situation to be modeled by compound channels occurs if there are two transmitters each of which would 
like to send one message to several receivers at the same time. The channels to the different receivers differ from 
each other because all the terminals are at different locations. Now, the following meaning can be given to the 
above variants of channel knowledge. If CSIT is given as $\tau = (\tau_1, \tau_2)$, this describes that the information is not
intended for all receivers, but only for those contained in $\mathcal{W}_\tau$. Knowledge about the intended receivers may be
asymmetric at the senders. If every receiver has its own decoding procedure, full CSIR (i.e. $r = \{\{W\} : W \in \mathcal{W}\}$)
would be a natural assumption. If the receivers must all use the same decoder, there is no CSIR. Non-trivial CSIR 
could mean that independently of the decision at the transmitters where data are to be sent (modeled by CSIT), a 
subset of receivers is chosen as the set which the data are intended for without informing the transmitters about 
this decision. 

Fig. 1. The MAC with Common Message

B. The MAC With Common Message 

Let the channel $(\mathcal{W}, t_1, t_2, r)$ be given. We now present the first of the problems treated in this paper, the
capacity region of the compound MAC with common message. It is an interesting information-theoretic model in 
itself. However, its main interest, at least in this paper, is that it provides a basis for the solution of the problem 
presented in the next section, which is the capacity region of the compound MAC with conferencing encoders. 

Assume that each transmitter has a set of private messages $[1, M_\nu]$, $\nu = 1, 2$, and that both transmitters have an
additional set of common messages $[1, M_0]$ for the receiver (Fig. 1). Let $n$ be a positive integer.

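Definition 2. A code$_{\mathrm{CM}}(n, M_0, M_1, M_2)$ is a triple $(f_1, f_2, \Phi)$ of functions satisfying

$$f_1 : [1, M_0] \times [1, M_1] \times T_1 \to \mathcal{X}^n,$$
$$f_2 : [1, M_0] \times [1, M_2] \times T_2 \to \mathcal{Y}^n,$$
$$\Phi : \mathcal{Z}^n \times R \to [1, M_0] \times [1, M_1] \times [1, M_2].$$

$n$ is called the blocklength of the code.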

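Remark 1. Clearly, the codes$_{\mathrm{CM}}(n, M_0, M_1, M_2)$ are in one-to-one correspondence with the families

$$\{(\mathbf{x}_{ij}^{\tau_1}, \mathbf{y}_{ik}^{\tau_2}, F_{ijk}^{\rho}) : (i, j, k) \in [1, M_0] \times [1, M_1] \times [1, M_2],\ (\tau_1, \tau_2, \rho) \in T_1 \times T_2 \times R\}, \tag{3}$$

where $\mathbf{x}_{ij}^{\tau_1} \in \mathcal{X}^n$, $\mathbf{y}_{ik}^{\tau_2} \in \mathcal{Y}^n$, and where the $F_{ijk}^{\rho} \subset \mathcal{Z}^n$ satisfy

$$F_{ijk}^{\rho} \cap F_{i'j'k'}^{\rho} = \emptyset \quad \text{if } (i, j, k) \neq (i', j', k').$$

(The sets $F_{ijk}^{\rho}$ are obtained from $\Phi$ by setting

$$F_{ijk}^{\rho} := \{\mathbf{z} \in \mathcal{Z}^n : \Phi(\mathbf{z}, \rho) = (i, j, k)\}.)$$

In the following, we will use the description of codes$_{\mathrm{CM}}$ as families as in (3). The functional description of codes
will be of use when we are dealing with transmitter cooperation. We say more on that in Remark 3.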

The $\mathbf{x}_{ij}^{\tau_1}$ and $\mathbf{y}_{ik}^{\tau_2}$ are the codewords and the $F_{ijk}^{\rho}$ are the decoding sets of the code. Let the transmitters have the
common message $i$. Suppose that transmitter 1 additionally has the private message $j$ and knows that $W \in \mathcal{W}_{\tau_1}$.
Then it uses the codeword $\mathbf{x}_{ij}^{\tau_1}$. If transmitter 2 additionally has the private message $k$ and knows that $W \in \mathcal{W}_{\tau_2}$, it
uses the codeword $\mathbf{y}_{ik}^{\tau_2}$. Suppose that the receiver knows that $W \in \mathcal{W}_\rho$. If the channel output $\mathbf{z} \in \mathcal{Z}^n$ is contained
in $F_{ijk}^{\rho}$, the receiver decides that the message triple $(i, j, k)$ has been sent.

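Definition 3. For $\lambda \in (0, 1)$, a code$_{\mathrm{CM}}(n, M_0, M_1, M_2)$ is a code$_{\mathrm{CM}}(n, M_0, M_1, M_2, \lambda)$ if

$$\sup_{\tau_1, \tau_2, \rho}\ \sup_{W \in \mathcal{W}_{\tau_1 \tau_2} \cap \mathcal{W}_\rho}\ \frac{1}{M_0 M_1 M_2} \sum_{i, j, k} W^n\bigl((F_{ijk}^{\rho})^c \,\big|\, \mathbf{x}_{ij}^{\tau_1}, \mathbf{y}_{ik}^{\tau_2}\bigr) \le \lambda.$$

That means that for every instance of channel knowledge at the transmitters and at the receiver, the encoding/decoding
chosen for this instance must yield a small average error for every channel state that may occur
according to the CSI. In other words, the code chosen for a particular instance $(\tau_1, \tau_2, \rho)$ of CSI must be universally
good for the class of channels $\{W \in \mathcal{W}_{\tau_1 \tau_2} \cap \mathcal{W}_\rho\}$.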

The first goal in this paper is to characterize the capacity region of the compound MAC with common message. 
That means that we will characterize the set of achievable rate triples and prove a weak converse. 

Definition 4. A rate triple $(R_0, R_1, R_2)$ is achievable for the compound channel $(\mathcal{W}, t_1, t_2, r)$ with common message
if for every $\varepsilon > 0$ and $\lambda \in (0, 1)$ and for $n$ large enough, there is a code$_{\mathrm{CM}}(n, M_0, M_1, M_2, \lambda)$ with

$$\frac{1}{n} \log M_\nu \ge R_\nu - \varepsilon \quad \text{for every } \nu = 0, 1, 2.$$

We denote the set of achievable rate triples by $\mathcal{C}_{\mathrm{CM}}(\mathcal{W}, t_1, t_2, r)$.

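Before stating the theorem on the capacity region, we need to introduce some new notation. We set $\Pi_1$ to be the
set of families

$$p = \{p_{\tau_1 \tau_2}(u, x, y) = p_0(u)\, p_{1\tau_1}(x|u)\, p_{2\tau_2}(y|u) : (\tau_1, \tau_2) \in T_1 \times T_2\}$$

of probability distributions, where $p_0$ is a distribution on a finite subset $\mathcal{U}$ of the integers, and where $(p_{1\tau_1}, p_{2\tau_2}) \in
\mathcal{K}(\mathcal{X}|\mathcal{U}) \times \mathcal{K}(\mathcal{Y}|\mathcal{U})$ for each $(\tau_1, \tau_2)$. Every $p \in \Pi_1$ defines a family of probability measures on $\mathcal{U} \times \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$,
where $\mathcal{U}$ is the set corresponding to $p$. This family consists of the probability measures $p_W$ ($W \in \mathcal{W}$), where

$$p_W(u, x, y, z) = p_0(u)\, p_{1\tau_1}(x|u)\, p_{2\tau_2}(y|u)\, W(z|x, y), \tag{4}$$

and where $(\tau_1, \tau_2) \in T_1 \times T_2$ is such that $W \in \mathcal{W}_{\tau_1 \tau_2}$. Let the quadruple of random variables $(U, X_{\tau_1}, Y_{\tau_2}, Z_W)$
take values in $\mathcal{U} \times \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$ with joint probability $p_W$. Then, define the set $\mathcal{R}_{\mathrm{CM}}(p, \tau_1, \tau_2, W)$ to be the set of
$(R_0, R_1, R_2)$, where every $R_\nu \ge 0$ and where

$$R_1 \le I(Z_W; X_{\tau_1} | Y_{\tau_2}, U),$$
$$R_2 \le I(Z_W; Y_{\tau_2} | X_{\tau_1}, U),$$
$$R_1 + R_2 \le I(Z_W; X_{\tau_1}, Y_{\tau_2} | U),$$
$$R_0 + R_1 + R_2 \le I(Z_W; X_{\tau_1}, Y_{\tau_2}).$$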
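
Defining

$$\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2) := \bigcup_{p \in \Pi_1}\ \bigcap_{(\tau_1, \tau_2) \in T_1 \times T_2}\ \bigcap_{W \in \mathcal{W}_{\tau_1 \tau_2}} \mathcal{R}_{\mathrm{CM}}(p, \tau_1, \tau_2, W),$$

we are able to state the first main result.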

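Theorem 1. For the compound MAC $(\mathcal{W}, t_1, t_2, r)$, one has

$$\mathcal{C}_{\mathrm{CM}}(\mathcal{W}, t_1, t_2, r) = \mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2),$$

and there is a weak converse. More exactly, for every $(R_0, R_1, R_2)$ in $\mathcal{C}_{\mathrm{CM}}(\mathcal{W}, t_1, t_2, r)$ and for every $\varepsilon > 0$, there
is a $\zeta$ such that there exists a sequence of codes$_{\mathrm{CM}}(n, M_0^{(n)}, M_1^{(n)}, M_2^{(n)}, 2^{-n\zeta})$ fulfilling

$$\frac{1}{n} \log M_\nu^{(n)} \ge R_\nu - \varepsilon, \quad \nu = 0, 1, 2,$$

if $n$ is large, i.e. one has exponential decay of the error probability with increasing blocklength.

$\mathcal{C}_{\mathrm{CM}}(\mathcal{W}, t_1, t_2, r)$ is convex. The cardinality of the auxiliary set $\mathcal{U}$ can be restricted to be at most $\min(|\mathcal{X}||\mathcal{Y}| + 2, |\mathcal{Z}| + 3)$.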
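
To make the single-letter description concrete, the four mutual information bounds defining the rate region for one state can be evaluated numerically, and the intersection over several states amounts to taking the minimum of each bound. The following is a minimal sketch for a single fixed CSIT pair and a fixed input distribution; the dictionary-based representation, the function names and the two toy channels are assumptions made for illustration, and the union over input distributions is not attempted.

```python
import math
from collections import defaultdict


def entropy(joint, keep):
    """Entropy in bits of the marginal of `joint` on the coordinates in `keep`."""
    marginal = defaultdict(float)
    for outcome, prob in joint.items():
        marginal[tuple(outcome[i] for i in keep)] += prob
    return -sum(q * math.log2(q) for q in marginal.values() if q > 0)


def cond_mi(joint, a, b, c=()):
    """Conditional mutual information I(A;B|C) from joint entropies."""
    a, b, c = tuple(a), tuple(b), tuple(c)
    return (entropy(joint, a + c) + entropy(joint, b + c)
            - entropy(joint, a + b + c) - entropy(joint, c))


def rate_bounds(p0, p1, p2, W):
    """Right-hand sides of the four rate inequalities for one channel state W."""
    joint = defaultdict(float)
    for u, pu in p0.items():
        for x, px in p1[u].items():
            for y, py in p2[u].items():
                for z, pz in W[(x, y)].items():
                    joint[(u, x, y, z)] += pu * px * py * pz
    U, X, Y, Z = 0, 1, 2, 3  # coordinate positions in each outcome tuple
    return (cond_mi(joint, [Z], [X], [Y, U]),
            cond_mi(joint, [Z], [Y], [X, U]),
            cond_mi(joint, [Z], [X, Y], [U]),
            cond_mi(joint, [Z], [X, Y]))


def worst_case_bounds(p0, p1, p2, states):
    """Intersecting the regions over states means taking each bound's minimum."""
    per_state = [rate_bounds(p0, p1, p2, W) for W in states]
    return tuple(min(bounds[i] for bounds in per_state) for i in range(4))


# Toy example: trivial U, uniform binary inputs, two noiseless states.
p0 = {0: 1.0}
p1 = {0: {0: 0.5, 1: 0.5}}
p2 = {0: {0: 0.5, 1: 0.5}}
pair_state = {(x, y): {(x, y): 1.0} for x in (0, 1) for y in (0, 1)}  # z = (x, y)
xor_state = {(x, y): {x ^ y: 1.0} for x in (0, 1) for y in (0, 1)}    # z = x XOR y

print(rate_bounds(p0, p1, p2, pair_state))
print(worst_case_bounds(p0, p1, p2, [pair_state, xor_state]))
```

In this toy case the XOR state limits the sum rate, so a code that must work for both states cannot exploit the better state.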

Remark 2. 1) A weak converse states that if a code has rates which are further than $\varepsilon > 0$ from the capacity
region and if its blocklength is large, then the average error of this code must be larger than a constant only
depending on $\varepsilon$. A moment's thought reveals that this is a stronger statement than just saying that the rates
outside of the capacity region are not achievable.

2) $\mathcal{C}_{\mathrm{CM}}(\mathcal{W}, t_1, t_2, r)$ is independent of the CSIR partition $r$. That means that given a certain CSIT, the capacity
region does not vary as CSIR varies. A heuristic explanation of this phenomenon is given in [22, Section 4.5]
for the case of single-user compound channels. It builds on the fact that the receiver can estimate the channel
from a pilot sequence with a length which is negligible compared to the blocklength.

3) Note that first taking a union and then an intersection of sets in the definition of $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$ is similar
to the max-min capacity expression for the classical single-user discrete memoryless compound channel [4].
We write two intersections instead of one in order to make clear the difference which remains between the
two expressions. Recall that the $p \in \Pi_1$ are families of probability measures. Every choice $(\tau_1, \tau_2) \in T_1 \times T_2$
activates a certain element of such a family $p$. The union and the first intersection are thus related in a more
complex manner than in the single-user expression.

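4) As CSIT increases, the capacity region grows, and in principle, one can read off from this how the region scales
with increasing channel knowledge at the transmitters. More precisely, assume that there are pairs $(t_1, t_2)$ and
$(t_1', t_2')$ of CSIT partitions,

$$t_\nu = \{\mathcal{W}_{\tau_\nu} : \tau_\nu \in T_\nu\}, \quad t_\nu' = \{\mathcal{W}'_{\tau_\nu'} : \tau_\nu' \in T_\nu'\} \quad (\nu = 1, 2),$$

such that $t_\nu'$ is finer than $t_\nu$ ($\nu = 1, 2$). That means that for every $\mathcal{W}'_{\tau_\nu'} \in t_\nu'$ there is a $\tau_\nu \in T_\nu$ with $\mathcal{W}'_{\tau_\nu'} \subset \mathcal{W}_{\tau_\nu}$,
so one can assume that $T_\nu \subset T_\nu'$. Observe that the $\Pi_1$ corresponding to $(t_1, t_2)$, which we call $\Pi_1(t_1, t_2)$ only
in this remark, can naturally be considered a subset of $\Pi_1(t_1', t_2')$, which denotes the $\Pi_1$ corresponding to
$(t_1', t_2')$ only for this remark. Thus

$$\bigcup_{p \in \Pi_1(t_1, t_2)}\ \bigcap_{(\tau_1, \tau_2) \in T_1 \times T_2}\ \bigcap_{W \in \mathcal{W}_{\tau_1 \tau_2}} \mathcal{R}_{\mathrm{CM}}(p, \tau_1, \tau_2, W) = \bigcup_{p \in \Pi_1(t_1, t_2)}\ \bigcap_{(\tau_1', \tau_2') \in T_1' \times T_2'}\ \bigcap_{W \in \mathcal{W}'_{\tau_1' \tau_2'}} \mathcal{R}_{\mathrm{CM}}(p, \tau_1', \tau_2', W),$$

and it follows that $\mathcal{C}_{\mathrm{CM}}(\mathcal{W}, t_1, t_2, r) \subset \mathcal{C}_{\mathrm{CM}}(\mathcal{W}, t_1', t_2', r)$.

Fig. 2. The MAC with Conferencing Encoders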

C. The MAC with Conferencing Encoders 

Again let the channel $(\mathcal{W}, t_1, t_2, r)$ be given. Here we assume that each transmitter only has a set of private
messages $[1, M_\nu]$ ($\nu = 1, 2$) for the receiver. Encoding is done in three stages. In the first stage, each encoder
transmits its message and CSIT to a central node, a "switch", over a noiseless rate-constrained discrete MAC. The 
rate constraints are part of the problem setting and thus fixed, but the noiseless MAC is not given, it is part of 
the code. For reasons that will become clear soon, we call it a "conferencing MAC". In the second stage, the 
information gathered by the switch is passed on to each encoder over channels without incurring noise or loss. The 
codewords are chosen in the third stage. Each encoder chooses its codewords using three parameters: the message 
it wants to transmit, its CSIT, and the output of the conferencing MAC. This is illustrated in Fig. 2.

The conferencing MAC can be chosen freely within the constraints, so it can be seen as a part of the encoding 
process. Assume that the blocklength of the codes used for transmission is set to be $n$. The rate constraints $(C_1, C_2)$
are such that $nC_\nu$ is the maximal number of bits transmitter $\nu$ can communicate to the receiving node of the
conferencing MAC. Thus if transmitter 1, say, has message $j$ and CSIT $\tau_1$, then transmitter 2, who knows neither
$j$ nor $\tau_1$, can use at most $nC_1$ additional bits from transmitter 1 to encode its own message. Consequently, there is
a limited degree of cooperation between the encoders enhancing the reliability of transmission. As the constraints
on the noiseless MAC are measured in terms of $n$, one can interpret the communication over this channel as taking
place during the transmission over $(\mathcal{W}, t_1, t_2, r)$ of the codeword preceding that which is constructed with the help
of the conferencing MAC.

Example 2 below shows how this kind of coding generalizes coding using Willems conferencing functions as
defined in [20], [21]. From Theorem 2 below it follows that Willems conferencing is more than just a special case.
In fact, it suffices to achieve the capacity region. In Section V-A, we give an application where it is useful to have
the more general notion of conferencing which is used here.

We now come to the formal definitions. Recall that a noiseless MAC is nothing but a function from a Cartesian 
product to some other space. 

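Definition 5. A code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2)$ is a quadruple $(f_1, f_2, g, \Phi)$ of functions which satisfy

$$f_1 : [1, M_1] \times \Gamma \times T_1 \to \mathcal{X}^n,$$
$$f_2 : [1, M_2] \times \Gamma \times T_2 \to \mathcal{Y}^n,$$
$$g : [1, M_1] \times [1, M_2] \times T_1 \times T_2 \to \Gamma,$$
$$\Phi : \mathcal{Z}^n \times R \to [1, M_1] \times [1, M_2],$$

where $\Gamma$ is a finite set and where $g$ satisfies

$$\frac{1}{n} \log \|g_{(j, \tau_1)}\| \le C_2 \quad \text{for all } (j, \tau_1) \in [1, M_1] \times T_1, \tag{5}$$

$$\frac{1}{n} \log \|g_{(k, \tau_2)}\| \le C_1 \quad \text{for all } (k, \tau_2) \in [1, M_2] \times T_2 \tag{6}$$

for the functions $g_{(j, \tau_1)}$ and $g_{(k, \tau_2)}$ defined by $g_{(j, \tau_1)}(k, \tau_2) = g_{(k, \tau_2)}(j, \tau_1) = g(j, k, \tau_1, \tau_2)$. The number
$n$ is called the blocklength of the code. $g$ is called a conferencing MAC or alternatively a generalized conferencing
function. The latter name is justified by Example 2.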
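
The rate constraints (5) and (6) can be checked mechanically for a given finite conferencing function: for each fixed pair at one transmitter, count the distinct outputs as the other transmitter's pair varies. The sketch below assumes logarithms to base 2 (so that rates are in bits) and uses a hypothetical callable `g`; the function name and the toy example are illustrative only.

```python
def satisfies_rate_constraints(g, M1, M2, T1, T2, n, C1, C2):
    """Check the two range-cardinality constraints on a conferencing function g.

    g is a callable g(j, k, tau1, tau2). With (j, tau1) fixed, the number of
    distinct outputs must not exceed 2**(n*C2); with (k, tau2) fixed, it must
    not exceed 2**(n*C1). Base-2 logarithms are an assumption of this sketch.
    """
    for j in range(1, M1 + 1):
        for tau1 in T1:
            image = {g(j, k, tau1, tau2)
                     for k in range(1, M2 + 1) for tau2 in T2}
            if len(image) > 2 ** (n * C2):
                return False
    for k in range(1, M2 + 1):
        for tau2 in T2:
            image = {g(j, k, tau1, tau2)
                     for j in range(1, M1 + 1) for tau1 in T1}
            if len(image) > 2 ** (n * C1):
                return False
    return True


def toy_g(j, k, tau1, tau2):
    """Toy conferencing function: reveals one bit of j and two bits of k."""
    return (j % 2, k % 4)


print(satisfies_rate_constraints(toy_g, 4, 4, [0], [0], n=1, C1=1, C2=2))
print(satisfies_rate_constraints(toy_g, 4, 4, [0], [0], n=1, C1=1, C2=1))
```

With blocklength 1, the toy function needs one bit from transmitter 1 and two bits from transmitter 2, so it passes the first check and fails the second.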

Remark 3. Analogous to the situation for the MAC with common message described in Remark 1, the code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2)$
given by the quadruple $(f_1, f_2, g, \Phi)$ uniquely determines a family

$$\{(\mathbf{x}_{jk}^{\tau_1 \tau_2}, \mathbf{y}_{jk}^{\tau_1 \tau_2}, F_{jk}^{\rho}) : (j, k) \in [1, M_1] \times [1, M_2],\ (\tau_1, \tau_2, \rho) \in T_1 \times T_2 \times R\}. \tag{7}$$

For the elements of this family, $\mathbf{x}_{jk}^{\tau_1 \tau_2} \in \mathcal{X}^n$ (not necessarily different!), $\mathbf{y}_{jk}^{\tau_1 \tau_2} \in \mathcal{Y}^n$ (not necessarily different!),
and the $F_{jk}^{\rho} \subset \mathcal{Z}^n$ satisfy

$$F_{jk}^{\rho} \cap F_{j'k'}^{\rho} = \emptyset \quad \text{if } (j, k) \neq (j', k').$$

For every $(\tau_1, \tau_2) \in T_1 \times T_2$, the family (7) must satisfy

$$\mathbf{x}_{jk}^{\tau_1 \tau_2} = \mathbf{x}_{jk'}^{\tau_1 \tau_2'} \quad \text{if } g(j, k, \tau_1, \tau_2) = g(j, k', \tau_1, \tau_2'), \tag{8}$$

$$\mathbf{y}_{jk}^{\tau_1 \tau_2} = \mathbf{y}_{j'k}^{\tau_1' \tau_2} \quad \text{if } g(j, k, \tau_1, \tau_2) = g(j', k, \tau_1', \tau_2). \tag{9}$$

Thus an alternative definition of codes$_{\mathrm{CONF}}$ would be families like the family (7) together with conferencing MACs
as in (5) and (6). This is the form we will mostly use in the paper because of shorter notation. However, the original
Definition 5 is more constructive and gives more insights into the practical use of such codes. It will be used in the
converse, where the way in which the codewords depend on the messages will be exploited.

Remark 4. Note that (5) and (6) really are rate constraints. Indeed, let $(S_1, S_2)$ be a rate pair achievable by the
MAC defined by $g$, where the average error criterion is used$^1$. Then by the characterization of the MAC with
non-cooperating encoders without common message (cf. ID Theorem 3.2.3]), there must be independent random 
variables $J$ on $[1, M_1] \times T_1$ and $K$ on $[1, M_2] \times T_2$ such that

$$S_1 \le I(g(J, K); J | K) = H(g(J, K) | K), \tag{10}$$

$$S_2 \le I(g(J, K); K | J) = H(g(J, K) | J), \tag{11}$$

$$S_1 + S_2 \le I(g(J, K); J, K) = H(g(J, K)). \tag{12}$$

But by the constraints (5) and (6), one knows that the right side of (10) must be smaller than $nC_1$ and the right
side of (11) must be smaller than $nC_2$. Clearly, the sum rate then must be smaller than $n(C_1 + C_2)$. Moreover, as
the bounds in (10)-(12) are achievable, it even follows $H(g(J, K)) \le n(C_1 + C_2)$ for every admissible choice of
$J$ and $K$.
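
The step from the cardinality constraint to the entropy bound in this remark can be written out as follows. This is a sketch of the standard argument, assuming base-2 logarithms and writing $g_{(k,\tau_2)}$ for $g$ with transmitter 2's pair $(k, \tau_2)$ held fixed; it uses that $g$ is deterministic and that $J$ and $K$ are independent.

```latex
\begin{align*}
I(g(J,K); J \mid K)
  &= H(g(J,K) \mid K) - H(g(J,K) \mid J, K)
   = H(g(J,K) \mid K), \\
H(g(J,K) \mid K)
  &= \sum_{(k,\tau_2)} \Pr\{K = (k,\tau_2)\}\, H\bigl(g_{(k,\tau_2)}(J)\bigr)
   \le \max_{(k,\tau_2)} \log \bigl\| g_{(k,\tau_2)} \bigr\|
   \le n C_1 .
\end{align*}
```

The first line holds because $g(J,K)$ is a function of $(J,K)$, the second because a random variable taking at most $\|g_{(k,\tau_2)}\|$ values has entropy at most $\log \|g_{(k,\tau_2)}\|$. The bound for the other transmitter follows by exchanging the roles of $J$ and $K$.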



With the above definition, the coding scheme is obvious: if the message pair $(j, k)$ is to be transmitted and if
the pair of CSIT instances is $(\tau_1, \tau_2)$, then the senders use the codewords $\mathbf{x}_{jk}^{\tau_1 \tau_2}$ and $\mathbf{y}_{jk}^{\tau_1 \tau_2}$, respectively. If CSIR
is $\rho$ and if the channel output is contained in the decoding set $F_{jk}^{\rho}$, then the receiver decides that the message pair
$(j, k)$ has been transmitted.

Definition 6. For $\lambda \in (0, 1)$, a code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2)$ is a code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2, \lambda)$ if

In the following example, we prove our claim that using generalized conferencing in the encoding process 
generalizes Willems' conferencing encoders. We fix the notation

(13) 

Example 2 (Willems Conferencing Functions). Let positive integers $V_1$ and $V_2$ be given which can be written as
products

$$V_\nu = V_{\nu,1} \cdots V_{\nu,I}$$

for some positive integer $I$ which does not depend on $\nu$. Assume that

$$\frac{1}{n} \log V_\nu \le C_\nu.$$

$^1$Even though the channel is noiseless, this does make a difference. In fact, Dueck showed in JSJ that the maximal and the average error
criteria differ for MACs using the example of a noiseless channel!
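We first give a formal definition of a pair of Willems conferencing functions $(g_1, g_2)$. Such a pair is determined in an
iterative manner via sequences of functions $h_{1,1}, \ldots, h_{1,I}$ and $h_{2,1}, \ldots, h_{2,I}$, where for $\nu = 1, 2$ and $i = 2, \ldots, I$ (with $\bar\nu$ denoting the index of the other transmitter),

\[ h_{\nu,1} : [1, M_\nu] \times T_\nu \to [1, V_{\nu,1}], \]

\[ h_{\nu,i} : [1, M_\nu] \times T_\nu \times [1, V_{\bar\nu,1}] \times \cdots \times [1, V_{\bar\nu,i-1}] \to [1, V_{\nu,i}]. \]

For $\nu = 1, 2$ and $i = 2, \ldots, I$, one recursively defines functions

\[ h^*_{\nu,1} : [1, M_\nu] \times T_\nu \to [1, V_{\nu,1}], \]
\[ h^*_{\nu,i} : [1, M_1] \times [1, M_2] \times T_1 \times T_2 \to [1, V_{\nu,i}] \]
by

\[ h^*_{\nu,i}(\ell_1, \ell_2, \tau_1, \tau_2) = h_{\nu,i}\big(\ell_\nu, \tau_\nu, h^*_{\bar\nu,1}(\ell_{\bar\nu}, \tau_{\bar\nu}), \ldots, h^*_{\bar\nu,i-1}(\ell_1, \ell_2, \tau_1, \tau_2)\big). \]

The functions $g_1, g_2$ are then obtained by setting

\[ g_\nu := (h^*_{\nu,1}, \ldots, h^*_{\nu,I}). \]

One checks easily that $g = (g_1, g_2)$ is a noiseless MAC with output alphabet $\Gamma = [1, V_1] \times [1, V_2]$ satisfying (5) and
(6). Clearly, $g_\nu(\ell_1, \ell_2, \tau_1, \tau_2)$ is known at transmitter $\nu$ because it only depends on $(\ell_\nu, \tau_\nu)$ and $g_{\bar\nu}(\ell_1, \ell_2, \tau_1, \tau_2)$.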

Note that not every conferencing MAC $g = (g_1, g_2)$ with output alphabet $[1, V_1] \times [1, V_2]$ can be obtained through
Willems conferencing. The most trivial example to see this is where $V_1$ is prime and where the conferencing function
$g_1$ mapping into $[1, V_1]$ depends on $k$. However, this setting can be given an interpretation in terms of MACs. Every
pair of Willems' conferencing functions is nothing but the $I$-fold use of a non-stationary noiseless MAC with
feedback. The above description of a transmission block of length $I$ over such a "Willems channel" as the one-shot
use of a noiseless MAC as above is possible because noise plays no role here.
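The iterative structure of a Willems conference can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `willems_conference`, the calling convention of the round functions and the two-round binary example are hypothetical and not taken from the paper. In round $i$, each encoder computes its symbol from its own message, its own CSIT and the partner's symbols from rounds $1, \ldots, i-1$.

```python
def willems_conference(h1, h2, l1, l2, tau1, tau2):
    """Run an I-round Willems conference and return the pair (g1, g2).

    h1[i] and h2[i] play the role of h_{1,i+1} and h_{2,i+1}: each takes the
    sender's own message, its own CSIT and the tuple of symbols received
    from the partner in all earlier rounds.
    """
    if len(h1) != len(h2):
        raise ValueError("both encoders must use the same number of rounds")
    sent1, sent2 = [], []
    for f1, f2 in zip(h1, h2):
        # Symbols of the current round depend only on earlier partner symbols.
        s1 = f1(l1, tau1, tuple(sent2))
        s2 = f2(l2, tau2, tuple(sent1))
        sent1.append(s1)
        sent2.append(s2)
    return tuple(sent1), tuple(sent2)


# Hypothetical two-round conference with binary symbols.
h1 = [lambda l, t, rx: l % 2, lambda l, t, rx: (l + rx[0]) % 2]
h2 = [lambda l, t, rx: t, lambda l, t, rx: rx[0]]
g1, g2 = willems_conference(h1, h2, l1=3, l2=5, tau1=0, tau2=1)
```

With a single round (lists of length one), neither symbol can depend on the partner's input, which is the one-shot case $I = 1$.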



For achievability and weak converse, we adapt the definitions from II-B to the conferencing setting. Let $C_1, C_2$
be nonnegative real numbers at least one of which is strictly greater than 0.

Definition 7. A rate pair $(R_1, R_2)$ is achievable for the compound channel $(\mathcal{W}, t_1, t_2, r)$ with conferencing encoders
with conferencing capacities $(C_1, C_2)$ if for every $\varepsilon > 0$ and $\lambda \in (0,1)$ and for $n$ large enough, there is a
code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2, \lambda)$ with

\[ \frac{1}{n} \log M_\nu \ge R_\nu - \varepsilon, \qquad \nu = 1, 2. \]

We denote the set of achievable rate pairs by $\mathcal{C}_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, r, C_1, C_2)$.

To state the result, we need to define the sets $\mathcal{R}_{\mathrm{CONF}}$. We denote by $\Pi_2$ the set of families

\[ p = \{ p_{\tau_1\tau_2}(u, x, y) := p_0(u)\, p_{1\tau_1\tau_2}(x|u)\, p_{2\tau_1\tau_2}(y|u) : (\tau_1, \tau_2) \in T_1 \times T_2 \} \]

of probability distributions, where $p_0$ is a distribution on a finite subset $\mathcal{U}$ of the integers and where $(p_{1\tau_1\tau_2}, p_{2\tau_1\tau_2}) \in \mathcal{K}(\mathcal{X}|\mathcal{U}) \times \mathcal{K}(\mathcal{Y}|\mathcal{U})$ for every $(\tau_1, \tau_2) \in T_1 \times T_2$ (cf. the definition of $\Pi_1$ in Subsection II-B). Every $p \in \Pi_2$ defines
a family of probability measures $p_W$ ($W \in \mathcal{W}$) on $\mathcal{U} \times \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$, where $\mathcal{U}$ is the set corresponding to $p$. This
family consists of the probability measures $p_W$ ($W \in \mathcal{W}$) defined by

\[ p_W(u, x, y, z) := p_0(u)\, p_{1\tau_1\tau_2}(x|u)\, p_{2\tau_1\tau_2}(y|u)\, W(z|x, y), \]

where $(\tau_1, \tau_2) \in T_1 \times T_2$ is such that $W \in \mathcal{W}_{\tau_1\tau_2}$. Finally we define subsets $\Pi_3$ and $\Pi_4$ of $\Pi_2$. $\Pi_3$ consists of those
$p \in \Pi_2$ where the $p_{1\tau_1\tau_2}$ do not depend on $\tau_2$, and $\Pi_4$ consists of those $p \in \Pi_2$ where the $p_{2\tau_1\tau_2}$ do not depend on $\tau_1$.

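For $W \in \mathcal{W}_\tau$, let $(U, X_\tau, Y_\tau, Z_W)$ be a quadruple of random variables which is distributed according to $p_W$.
The set $\mathcal{R}_{\mathrm{CONF}}(p, W, C_1, C_2)$ is defined as the set of those pairs $(R_1, R_2)$ of non-negative reals which satisfy

\[ R_1 \le I(Z_W; X_\tau \mid Y_\tau, U) + C_1, \]
\[ R_2 \le I(Z_W; Y_\tau \mid X_\tau, U) + C_2, \]
\[ R_1 + R_2 \le \big( I(Z_W; X_\tau, Y_\tau \mid U) + C_1 + C_2 \big) \wedge I(Z_W; X_\tau, Y_\tau). \]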
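As a small illustration, membership in $\mathcal{R}_{\mathrm{CONF}}(p, W, C_1, C_2)$ can be checked once the mutual information terms are known. The sketch below assumes the individual constraint on $R_1$ is the mirror image of the one on $R_2$; the function name, argument names and numerical values are hypothetical.

```python
def in_r_conf(r1, r2, i_x, i_y, i_xy_given_u, i_xy, c1, c2):
    """Check (r1, r2) against the constraints defining R_CONF(p, W, C1, C2).

    i_x          -- I(Z_W; X | Y, U)
    i_y          -- I(Z_W; Y | X, U)
    i_xy_given_u -- I(Z_W; X, Y | U)
    i_xy         -- I(Z_W; X, Y)
    """
    if r1 < 0 or r2 < 0:
        return False
    individual_ok = r1 <= i_x + c1 and r2 <= i_y + c2
    # The sum rate is limited both by the conferencing-boosted conditional
    # term and by the full-cooperation term I(Z_W; X, Y).
    sum_ok = r1 + r2 <= min(i_xy_given_u + c1 + c2, i_xy)
    return individual_ok and sum_ok


# Hypothetical mutual information values, in bits per channel use.
with_conferencing = in_r_conf(0.5, 0.5, 0.3, 0.3, 0.5, 1.2, c1=0.25, c2=0.25)
without_conferencing = in_r_conf(0.5, 0.5, 0.3, 0.3, 0.5, 1.2, c1=0.0, c2=0.0)
```

In this toy setting the rate pair is admissible only once the conferencing capacities are switched on.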

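If $C_1, C_2 > 0$, define the set

\[ \mathcal{C}^*_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, C_1, C_2) := \bigcup_{p \in \Pi_2} \; \bigcap_{(\tau_1,\tau_2) \in T_1 \times T_2} \; \bigcap_{W \in \mathcal{W}_{\tau_1\tau_2}} \mathcal{R}_{\mathrm{CONF}}(p, W, C_1, C_2). \]

If $C_1 > 0$, $C_2 = 0$ (the reverse case is analogous with $\Pi_4$ replacing $\Pi_3$), define the set

\[ \mathcal{C}^*_{\mathrm{CONF},1}(\mathcal{W}, t_1, t_2, C_1) := \bigcup_{p \in \Pi_3} \; \bigcap_{(\tau_1,\tau_2) \in T_1 \times T_2} \; \bigcap_{W \in \mathcal{W}_{\tau_1\tau_2}} \mathcal{R}_{\mathrm{CONF}}(p, W, C_1, 0). \]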

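Theorem 2. For the channel $(\mathcal{W}, t_1, t_2, r)$ and the pair $(C_1, C_2)$ of nonnegative real numbers, one has

\[ \mathcal{C}_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, r, C_1, C_2) = \begin{cases} \mathcal{C}^*_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, C_1, C_2) & \text{if } C_1, C_2 > 0, \\ \mathcal{C}^*_{\mathrm{CONF},1}(\mathcal{W}, t_1, t_2, C_1) & \text{if } C_1 > 0,\ C_2 = 0, \\ \mathcal{C}^*_{\mathrm{CONF},2}(\mathcal{W}, t_1, t_2, C_2) & \text{if } C_1 = 0,\ C_2 > 0. \end{cases} \]

This set can already be achieved using one-shot Willems conferencing functions, i.e. functions as defined in Example
2 with $I = 1$. More exactly, for every $(R_1, R_2) \in \mathcal{C}_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, r, C_1, C_2)$ and for every $\varepsilon > 0$, there is a $\zeta$ such
that there exists a sequence of codes$_{\mathrm{CONF}}(n, M_1^{(n)}, M_2^{(n)}, C_1, C_2, 2^{-n\zeta})$ fulfilling

\[ \frac{1}{n} \log M_\nu^{(n)} \ge R_\nu - \varepsilon, \qquad \nu = 1, 2 \]

for large $n$ and using a one-shot Willems conference. $\mathcal{C}_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, r, C_1, C_2)$ is convex. One also has a weak
converse. Further, the cardinality of the auxiliary set $\mathcal{U}$ can be restricted to be at most $\min(|\mathcal{X}||\mathcal{Y}| + 2, |\mathcal{Z}| + 3)$.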

Remark 2 applies here, too. Further, we note
Remark 5. If $C_1, C_2 > 0$, then $\mathcal{C}^*_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, C_1, C_2) = \mathcal{C}^*_{\mathrm{CONF}}(\mathcal{W}, t, t, C_1, C_2)$, where

\[ t = \{ \mathcal{W}_{\tau_1\tau_2} : (\tau_1, \tau_2) \in T_1 \times T_2 \}. \]

Thus bidirectional conferencing leads to a complete exchange of CSIT. The capacity region only depends on the
joint CSIT at both transmitters; the asymmetry is lost.

Before beginning with the proof in the next section, we use Theorem 2 to find out how much cooperation is
necessary to achieve the full-cooperation performance, i.e. the performance achieved when $C_1 = C_2 = \infty$, if
cooperation in both directions is possible at all. (So we do not ask how large $C_1$ must be if $C_2 = 0$.) By Theorem
2, the region of rates achievable with full cooperation is given by

\[ 0 \le R_1 + R_2 \le C^\infty := \max_{p \in \Pi_2} \; \min_{\tau \in T_1 \times T_2} \; \inf_{W \in \mathcal{W}_\tau} I(Z_W; X_\tau, Y_\tau). \tag{14} \]

$C^\infty$ also determines the maximally achievable sum rate.

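Let $\mathcal{M}$ be the set of those $p \in \Pi_2$ which achieve the maximum in (14). Then

Corollary 1. 1) The infinite cooperation sum capacity is achievable if and only if

\[ C_1 + C_2 \ge C^\infty - \max_{p \in \mathcal{M}} \; \min_{\tau \in T_1 \times T_2} \; \inf_{W \in \mathcal{W}_\tau} I(Z_W; X_\tau, Y_\tau \mid U). \tag{15} \]

2) The full cooperation region is achieved if

\[ C_1 \ge C^\infty - \max_{p \in \Pi_2} \; \min_{\tau \in T_1 \times T_2} \; \inf_{W \in \mathcal{W}_\tau} I(Z_W; X_\tau \mid Y_\tau, U), \]

\[ C_2 \ge C^\infty - \max_{p \in \Pi_2} \; \min_{\tau \in T_1 \times T_2} \; \inf_{W \in \mathcal{W}_\tau} I(Z_W; Y_\tau \mid X_\tau, U). \]

In particular, infinite-capacity cooperation is necessary neither to achieve the full-cooperation sum rate
nor to achieve the full-cooperation rate region.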
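To make the thresholds of Corollary 1 concrete, the following Python sketch evaluates them from precomputed worst-case mutual informations. It reads condition (15) with the conditional term $I(Z_W; X_\tau, Y_\tau \mid U)$, as the final display of the proof suggests. All names and numbers are hypothetical, and the maximizations over $p$ are assumed to have been carried out elsewhere.

```python
def min_sum_cooperation(c_inf, worst_case_cond_mi):
    """Smallest C1 + C2 meeting condition (15).

    worst_case_cond_mi lists, for each candidate p achieving the maximum in
    (14), the value min over tau, inf over W of I(Z_W; X, Y | U).
    """
    return c_inf - max(worst_case_cond_mi)


def full_region_thresholds(c_inf, best_i_x, best_i_y):
    """Sufficient (C1, C2) for the full-cooperation region (Corollary 1, part 2).

    best_i_x and best_i_y are the maximized worst-case values of
    I(Z_W; X | Y, U) and I(Z_W; Y | X, U), respectively.
    """
    return c_inf - best_i_x, c_inf - best_i_y


# Hypothetical values in bits per channel use.
sum_threshold = min_sum_cooperation(1.5, [0.9, 1.1])
c1_threshold, c2_threshold = full_region_thresholds(1.5, 0.75, 0.5)
```

Both thresholds are finite whenever $C^\infty$ is finite, which is the point of the closing remark of the corollary.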

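Proof. 1) Denote the maximal sum rate achievable with cooperation capacities $C_1, C_2 > 0$ by $C(C_1, C_2)$. As
for $C^\infty$, the problem of finding $C(C_1, C_2)$ is a maximization problem: one has

\[ C(C_1, C_2) = \max_{p \in \Pi_2} \; \min_{\tau \in T_1 \times T_2} \; \inf_{W \in \mathcal{W}_\tau} \big( I(Z_W; X_\tau, Y_\tau \mid U) + C_1 + C_2 \big) \wedge I(Z_W; X_\tau, Y_\tau). \]

The inequality

\[ C(C_1, C_2) \ge C^\infty \tag{16} \]

holds if and only if there is a $p \in \Pi_2$ such that

\[ \min_{\tau \in T_1 \times T_2} \; \inf_{W \in \mathcal{W}_\tau} \big( I(Z_W; X_\tau, Y_\tau \mid U) + C_1 + C_2 \big) \wedge I(Z_W; X_\tau, Y_\tau) \ge C^\infty. \]

That means in particular that

\[ \min_{\tau \in T_1 \times T_2} \; \inf_{W \in \mathcal{W}_\tau} I(Z_W; X_\tau, Y_\tau) \ge C^\infty, \]

so $p$ must maximize

\[ \min_{\tau \in T_1 \times T_2} \; \inf_{W \in \mathcal{W}_\tau} I(Z_W; X_\tau, Y_\tau). \]

Then (16) is equivalent to

\[ \max_{p \in \mathcal{M}} \; \min_{\tau \in T_1 \times T_2} \; \inf_{W \in \mathcal{W}_\tau} \big( I(Z_W; X_\tau, Y_\tau \mid U) + C_1 + C_2 \big) \ge C^\infty, \]

and this proves (15).

2) This part is trivial. ■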

In Section V we present a numerical example which shows how the rate region changes with the conferencing
capacities.

III. The Achievability Proofs 

A. The MAC with Common Message 

The proof of the achievability of $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$ proceeds as follows. We first show that $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$ is
achievable using random codes, where codewords and decoding sets are chosen at random and the error is measured
by taking the mean average error over all realizations. For this part, we adapt the nice proof used by Jahn [HI in
the context of arbitrarily varying multiuser channels to the setting of the compound MAC with common message.
It uses some hypergraph terminology. An alternative proof proceeding as in standard random coding can be found
in ifTSJI . It uses the same encoding and the same decoding, but needs the additional assumption that $|\mathcal{W}| < \infty$.
Next, we derandomize, i.e. we extract a good deterministic code from the random one. This is much easier than
for arbitrarily varying channels. It is first done for $|\mathcal{W}| < \infty$, and then an approximation argument is used for the
case $|\mathcal{W}| = \infty$.

We assume here that the receiver has no CSI and show that $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$ is achievable. This gives an inner
bound to the capacity region for arbitrary CSIR-function $r$. As $\rho$ is trivial in the no-CSIR case, we omit it in the
notation.

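1) Hypergraphs: A cubic hypergraph is a discrete set of the form $\mathcal{U} \times \mathcal{X} \times \mathcal{Y}$ with a collection $\mathcal{E}$ of subsets
$E \subset \mathcal{U} \times \mathcal{X} \times \mathcal{Y}$.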

Definition 8. Consider a family $\{(U_i, X_{ij}, Y_{ik}) : i \in [1, M_0],\ j \in [1, M_1],\ k \in [1, M_2]\}$ of random vectors,
where the $U_i$ take values in $\mathcal{U}$, the $X_{ij}$ take values in $\mathcal{X}$, and the $Y_{ik}$ take values in $\mathcal{Y}$. This family is a random
$(M_0, M_1, M_2)$-half lattice in $\mathcal{U} \times \mathcal{X} \times \mathcal{Y}$ if the family

\[ \big\{ \{ (U_i, X_{ij}, Y_{ik}) : (j, k) \in [1, M_1] \times [1, M_2] \} : i \in [1, M_0] \big\} \]

of random vectors is i.i.d. and such that given $U_i$,

• the pair of families $\{X_{ij} : j \in [1, M_1]\}$, $\{Y_{ik} : k \in [1, M_2]\}$ is conditionally independent,

• the family $X_{ij}$, where $j \in [1, M_1]$, is conditionally i.i.d.,

• the family $Y_{ik}$, where $k \in [1, M_2]$, is conditionally i.i.d.
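The structure in Definition 8 can be made concrete by sampling from it. The Python sketch below draws a single-letter random half lattice; the function name, the dictionary encoding of the distributions and the example distributions are illustrative assumptions and are not part of the paper.

```python
import random


def sample_half_lattice(m0, m1, m2, p0, p1, p2, seed=0):
    """Draw a random (m0, m1, m2)-half lattice over single letters.

    p0 maps u to its probability; p1[u] and p2[u] are the conditional
    distributions of X and Y given U = u, each as {symbol: probability}.
    """
    rng = random.Random(seed)

    def draw(dist):
        values, weights = zip(*dist.items())
        return rng.choices(values, weights=weights)[0]

    lattice = []
    for _ in range(m0):
        u = draw(p0)  # the U_i are i.i.d.
        xs = [draw(p1[u]) for _ in range(m1)]  # conditionally i.i.d. given U_i
        ys = [draw(p2[u]) for _ in range(m2)]  # drawn independently of the xs
        lattice.append((u, xs, ys))
    return lattice


# Hypothetical example: X copies U, Y is a fair coin regardless of U.
p0 = {0: 0.5, 1: 0.5}
p1 = {0: {0: 1.0}, 1: {1: 1.0}}
p2 = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}
lattice = sample_half_lattice(3, 4, 2, p0, p1, p2, seed=1)
```

Each entry of `lattice` holds one $U_i$ together with its families $\{X_{ij}\}$ and $\{Y_{ik}\}$, so the vector $(U_i, X_{ij}, Y_{ik})$ is read off by indexing.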

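Let a random $(M_0, M_1, M_2)$-half lattice on $\mathcal{U} \times \mathcal{X} \times \mathcal{Y}$ be realized on a probability space $(\Omega, \mathcal{F}, P)$. For any
$E \in \mathcal{E}$, $(i, j, k) \in [1, M_0] \times [1, M_1] \times [1, M_2]$, and $(u, x, y) \in \mathcal{U} \times \mathcal{X} \times \mathcal{Y}$ we define$^2$

\[ P_E(i, j, k) := P\big[ E \cap \{ (U_{i'}, X_{i'j'}, Y_{i'k'}) : i' \ne i,\ j',\ k' \} \ne \emptyset \,\big|\, (U_i, X_{ij}, Y_{ik}) = (u, x, y) \big], \]

\[ P_{E|u}(i, j, k) := P\big[ E|_u \cap \{ (X_{ij'}, Y_{ik'}) : j' \ne j,\ k' \ne k \} \ne \emptyset \,\big|\, (U_i, X_{ij}, Y_{ik}) = (u, x, y) \big], \]

\[ P_{E|(u,x)}(i, j, k) := P\big[ E|_{(u,x)} \cap \{ Y_{ik'} : k' \ne k \} \ne \emptyset \,\big|\, (U_i, X_{ij}, Y_{ik}) = (u, x, y) \big], \]

\[ P_{E|(u,y)}(i, j, k) := P\big[ E|_{(u,y)} \cap \{ X_{ij'} : j' \ne j \} \ne \emptyset \,\big|\, (U_i, X_{ij}, Y_{ik}) = (u, x, y) \big]. \]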

We now state an analogue to the Hit Lemmas in |i8j which, just like those, is proved immediately using the
independence/conditional independence properties of the random $(M_0, M_1, M_2)$-half lattice and the union bound.

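Lemma 1. For a random $(M_0, M_1, M_2)$-half lattice on $\mathcal{U} \times \mathcal{X} \times \mathcal{Y}$, and for any $E \in \mathcal{E}$, $(i, j, k) \in [1, M_0] \times
[1, M_1] \times [1, M_2]$, and $(u, x, y) \in \mathcal{U} \times \mathcal{X} \times \mathcal{Y}$,

\[ P_E(i, j, k) \le M_0 M_1 M_2\, P[(U_1, X_{11}, Y_{11}) \in E], \]
\[ P_{E|u}(i, j, k) \le M_1 M_2\, P[(X_{11}, Y_{11}) \in E|_u \mid U_1 = u], \]
\[ P_{E|(u,y)}(i, j, k) \le M_1\, P[X_{11} \in E|_{(u,y)} \mid U_1 = u], \]
\[ P_{E|(u,x)}(i, j, k) \le M_2\, P[Y_{11} \in E|_{(u,x)} \mid U_1 = u]. \]

Hence, for any probability measure $p$ on $(\mathcal{U} \times \mathcal{X} \times \mathcal{Y}) \times \mathcal{E}$,

\[ \sum_{u, x, y, E} p(u, x, y, E) \big( P_E(i, j, k) + P_{E|u}(i, j, k) + P_{E|(u,x)}(i, j, k) + P_{E|(u,y)}(i, j, k) \big) \]
\[ \le M_0 M_1 M_2 \max_{E} P[(U_1, X_{11}, Y_{11}) \in E] \]
\[ \quad + M_1 M_2 \max_{E, u} P[(X_{11}, Y_{11}) \in E|_u \mid U_1 = u] \]
\[ \quad + M_1 \max_{E, u, y} P[X_{11} \in E|_{(u,y)} \mid U_1 = u] \]
\[ \quad + M_2 \max_{E, u, x} P[Y_{11} \in E|_{(u,x)} \mid U_1 = u]. \]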

2) The Encoding/Decoding Procedure: We can now return to the proof of the achievability part of Theorem 1.
Let the channel $(\mathcal{W}, t_1, t_2)$ be given (recall that the receiver is assumed to have no CSIR). We define a random
code with block length $n$ which encodes $M_0$ common messages, $M_1$ messages of the first transmitter, and $M_2$
messages of the second transmitter. The randomness of the code can be viewed in two ways. First, one can see
it as a method of proof which allows us to find a number of codes from which we will select a good one later.
However, the randomness could also be incorporated into the system. Given that the transmitters and the receiver
have access to the common randomness needed in the definition of the code, this already gives an achievable rate
region if this randomness is exploited in the coding process. During the proof, one will see that this region is even
achievable using a maximal error criterion. One needs to use the average error criterion when the achievability proof
for random codes is strengthened in order to obtain the desired achievability part of Theorem 1, which requires the
use of deterministic codes.

$^2$Recall the notation defined in the Introduction.

Using the notation introduced before Theorem 1, we define a set of $M_0$ i.i.d. families of random variables

\[ \{ (U_i, X_{ij}^{\tau_1}, Y_{ik}^{\tau_2}) : (j, k) \in [1, M_1] \times [1, M_2],\ (\tau_1, \tau_2) \in T_1 \times T_2 \}. \]

Let

\[ p = \{ p_0(\cdot)\, p_{1\tau_1}(\cdot|\cdot)\, p_{2\tau_2}(\cdot|\cdot) : (\tau_1, \tau_2) \in T_1 \times T_2 \} \in \Pi_1 \]

and let $\mathcal{U}$ be the corresponding finite subset of the integers. The distribution $p_0^n$ of each $U_i$ on $\mathcal{U}^n$ is the $n$-fold
product of $p_0$. Given $U_i$, the rest of the random variables in family $i$ is assumed to be conditionally independent.
The conditional distribution $p_{1\tau_1}^n$ of each $X_{ij}^{\tau_1}$ given $U_i$ on $\mathcal{X}^n$ is the $n$-fold memoryless extension of
$p_{1\tau_1}$, and the conditional distribution $p_{2\tau_2}^n$ of each $Y_{ik}^{\tau_2}$ given $U_i$ on $\mathcal{Y}^n$ is the $n$-fold memoryless extension of $p_{2\tau_2}$.
Given a message triple $(i, j, k)$ that is to be transmitted and given an instance $(\tau_1, \tau_2)$ of CSIT, the transmitters use
the random codewords $X_{ij}^{\tau_1}$ and $Y_{ik}^{\tau_2}$.

We now define the decoding procedure, which requires access to the same random experiment as used for
encoding. Fix a $\delta > 0$. The $p$ used in the encoding process and every $W \in \mathcal{W}$ define a probability measure $p_W$ as
in (4). For every $\tau = (\tau_1, \tau_2)$, define a set

\[ E^\tau := \bigcup_{W \in \mathcal{W}_\tau} T^n_{p_W, \delta} \]

(cf. the notation section in the Introduction). This set does not depend on the state $W \in \mathcal{W}_\tau$. The decoding sets
are defined as follows: $F_{ijk}$ consists exactly of those $z \in \mathcal{Z}^n$ which satisfy both of the following conditions:
• there is a $(\tau_1, \tau_2)$ such that

\[ (U_i, X_{ij}^{\tau_1}, Y_{ik}^{\tau_2}) \in E^\tau|_z, \]

• for all $(i', j', k') \ne (i, j, k)$ and for all $(\tau_1, \tau_2)$,

\[ (U_{i'}, X_{i'j'}^{\tau_1}, Y_{i'k'}^{\tau_2}) \notin E^\tau|_z. \]

Clearly the $F_{ijk}$ are disjoint. This decision rule does not depend on $\tau$, nor on $W$.

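3) Bounding the Mean Maximal Error for Random Coding: We now bound the mean maximal error incurred
by random coding, i.e. for each message triple $(i, j, k)$, CSIT instance $\tau = (\tau_1, \tau_2)$, and channel state $W \in \mathcal{W}_\tau$,
we ask how large

\[ \mathbb{E}\big[ W^n(F_{ijk}^c \mid X_{ij}^{\tau_1}, Y_{ik}^{\tau_2}) \big] \tag{17} \]

can be. The receiver makes an error (decides incorrectly) if for the channel output $z$, one of the following holds:

E1) $(U_i, X_{ij}^{\tau_1'}, Y_{ik}^{\tau_2'}) \notin E^{\tau'}|_z$ for all $\tau' = (\tau_1', \tau_2')$,

E2) there is an $i' \ne i$ and arbitrary $(j', k')$ and $\tau' = (\tau_1', \tau_2')$ such that $(U_{i'}, X_{i'j'}^{\tau_1'}, Y_{i'k'}^{\tau_2'}) \in E^{\tau'}|_z$,

E3) there is a $j' \ne j$ and a $k' \ne k$ and arbitrary $\tau' = (\tau_1', \tau_2')$ such that $(U_i, X_{ij'}^{\tau_1'}, Y_{ik'}^{\tau_2'}) \in E^{\tau'}|_z$,

E4) there is a $j' \ne j$ and arbitrary $\tau' = (\tau_1', \tau_2')$ such that $(U_i, X_{ij'}^{\tau_1'}, Y_{ik}^{\tau_2'}) \in E^{\tau'}|_z$,

E5) there is a $k' \ne k$ and arbitrary $\tau' = (\tau_1', \tau_2')$ such that $(U_i, X_{ij}^{\tau_1'}, Y_{ik'}^{\tau_2'}) \in E^{\tau'}|_z$.

The mean probability of the event described in (E1) is upper-bounded by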






E 



J2w-iz\Xll,Y^^^)l^ 



Note that the joint probability of the triple $(U_i, X_{ij}^{\tau_1}, Y_{ik}^{\tau_2})$ and the channel output is $p_W^n$. Lemma [T] from the
Appendix then implies that the above term can be bounded by

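We now bound the probability that one of the events (E2)-(E5) holds for some fixed $(\tau_1', \tau_2')$. To this end we use
Lemma 1. The pair $(\mathcal{U}^n \times \mathcal{X}^n \times \mathcal{Y}^n, \mathcal{E})$, where $\mathcal{E} = \{ E^{\tau'}|_z : z \in \mathcal{Z}^n \}$, defines a cubic hypergraph. Further, the
collection of random vectors

\[ \{ (U_{i'}, X_{i'j'}^{\tau_1'}, Y_{i'k'}^{\tau_2'}) : i', j', k' \} \]

is a random $(M_0, M_1, M_2)$-half lattice on $\mathcal{U}^n \times \mathcal{X}^n \times \mathcal{Y}^n$. One obtains a probability measure on $\mathcal{U}^n \times \mathcal{X}^n \times \mathcal{Y}^n \times \mathcal{E}$
via

\[ q(\mathbf{u}, \mathbf{x}, \mathbf{y}, E^{\tau'}|_z) = W^n(z \mid \mathbf{x}, \mathbf{y})\, P[(U_i, X_{ij}^{\tau_1}, Y_{ik}^{\tau_2}) = (\mathbf{u}, \mathbf{x}, \mathbf{y})]. \]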

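We then obtain for fixed $(\tau_1', \tau_2')$ that

\[ \mathbb{E}\Big[ \sum_{z} W^n(z \mid X_{ij}^{\tau_1}, Y_{ik}^{\tau_2})\, 1_{\{\text{(E2), (E3), (E4), or (E5) holds for } \tau'\}} \Big] \]
\[ \le \sum_{\mathbf{u}, \mathbf{x}, \mathbf{y}, z} q(\mathbf{u}, \mathbf{x}, \mathbf{y}, E^{\tau'}|_z) \Big( P[\text{(E2) holds for } \tau' \mid U_i = \mathbf{u}, X_{ij} = \mathbf{x}, Y_{ik} = \mathbf{y}] \]
\[ \qquad + P[\text{(E3) holds for } \tau' \mid U_i = \mathbf{u}, X_{ij} = \mathbf{x}, Y_{ik} = \mathbf{y}] \]
\[ \qquad + P[\text{(E4) holds for } \tau' \mid U_i = \mathbf{u}, X_{ij} = \mathbf{x}, Y_{ik} = \mathbf{y}] \]
\[ \qquad + P[\text{(E5) holds for } \tau' \mid U_i = \mathbf{u}, X_{ij} = \mathbf{x}, Y_{ik} = \mathbf{y}] \Big). \]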
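By the half-lattice property and Lemma 1, the above term can be upper-bounded by

\[ M_0 M_1 M_2 \max_{z} P[(U_1, X_{11}^{\tau_1'}, Y_{11}^{\tau_2'}) \in E^{\tau'}|_z] \tag{19} \]

\[ + M_1 M_2 \max_{z, \mathbf{u}} P[(X_{11}^{\tau_1'}, Y_{11}^{\tau_2'}) \in E^{\tau'}|_{(z, \mathbf{u})} \mid U_1 = \mathbf{u}] \tag{20} \]

\[ + M_1 \max_{z, \mathbf{u}, \mathbf{y}} P[X_{11}^{\tau_1'} \in E^{\tau'}|_{(z, \mathbf{u}, \mathbf{y})} \mid U_1 = \mathbf{u}] \tag{21} \]

\[ + M_2 \max_{z, \mathbf{u}, \mathbf{x}} P[Y_{11}^{\tau_2'} \in E^{\tau'}|_{(z, \mathbf{u}, \mathbf{x})} \mid U_1 = \mathbf{u}]. \tag{22} \]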

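It remains to bound the expressions (19)-(22). For every $W \in \mathcal{W}_{\tau'}$, let the random vector $(U, X_{\tau_1'}, Y_{\tau_2'}, Z_W)$ have
distribution $p_W$. We use Lemma 6 a) and Lemma 9 from the Appendix to bound (19) by

\[ M_0 M_1 M_2\, 2^{-n\left( \inf_{W \in \mathcal{W}_{\tau'}} I(Z_W; U, X_{\tau_1'}, Y_{\tau_2'}) - \zeta_1 \right)}. \]

This equals

\[ M_0 M_1 M_2\, 2^{-n\left( \inf_{W \in \mathcal{W}_{\tau'}} I(Z_W; X_{\tau_1'}, Y_{\tau_2'}) - \zeta_1 \right)} \tag{23} \]

because the sequence $(U, [X_{\tau_1'}, Y_{\tau_2'}], Z_W)$ forms a Markov chain. Here, $\zeta_1$ is an error term which depends on $\delta$ and
which converges to zero as $\delta$ tends to zero. Using Lemma 6 b) and Lemma 9 from the Appendix, we see that the
terms in (20)-(22) can be bounded by

\[ M_1 M_2\, 2^{-n\left( \inf_{W \in \mathcal{W}_{\tau'}} I(Z_W; X_{\tau_1'}, Y_{\tau_2'} \mid U) - \zeta_2 \right)}, \tag{24} \]

\[ M_1\, 2^{-n\left( \inf_{W \in \mathcal{W}_{\tau'}} I(Z_W, Y_{\tau_2'}; X_{\tau_1'} \mid U) - \zeta_3 \right)}, \tag{25} \]

\[ M_2\, 2^{-n\left( \inf_{W \in \mathcal{W}_{\tau'}} I(Z_W, X_{\tau_1'}; Y_{\tau_2'} \mid U) - \zeta_4 \right)}, \tag{26} \]

respectively. Here, again, $\zeta_2, \zeta_3, \zeta_4$ depend on $\delta$ and converge to zero as $\delta$ tends to zero. The bounds in (25) and
(26) can be reduced to

\[ M_1\, 2^{-n\left( \inf_{W \in \mathcal{W}_{\tau'}} I(Z_W; X_{\tau_1'} \mid Y_{\tau_2'}, U) - \zeta_3 \right)}, \tag{27} \]

\[ M_2\, 2^{-n\left( \inf_{W \in \mathcal{W}_{\tau'}} I(Z_W; Y_{\tau_2'} \mid X_{\tau_1'}, U) - \zeta_4 \right)}. \tag{28} \]

For (27), this follows from

\[ I(Z_W, Y_{\tau_2'}; X_{\tau_1'} \mid U) = I(Y_{\tau_2'}; X_{\tau_1'} \mid U) + I(Z_W; X_{\tau_1'} \mid Y_{\tau_2'}, U) = I(Z_W; X_{\tau_1'} \mid Y_{\tau_2'}, U), \]

where the chain rule for mutual information was used and the fact that $X_{\tau_1'}$ and $Y_{\tau_2'}$ are conditionally independent
given $U$. The bound (28) follows in an analogous way. Collecting (18) and, for each $(\tau_1', \tau_2') \in T_1 \times T_2$, the bounds
(23), (24), (27), and (28), we obtain an upper bound for the mean maximal error defined in (17) of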

+ \Ti\\T2\MiM2 2^"'™"-'eTixr2 infiv'ew^, / (Zvi^;^,-i',^ |C/)-C2) 

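Note that this bound is uniform in $W$. It tends to zero exponentially with rate $\zeta > 0$ if

\[ \frac{1}{n} \log(M_0 M_1 M_2) \le \min_{\tau' \in T_1 \times T_2} \; \inf_{W' \in \mathcal{W}_{\tau'}} I(Z_{W'}; X_{\tau_1'}, Y_{\tau_2'}) - \zeta_1 - \zeta, \]

\[ \frac{1}{n} \log(M_1 M_2) \le \min_{\tau' \in T_1 \times T_2} \; \inf_{W' \in \mathcal{W}_{\tau'}} I(Z_{W'}; X_{\tau_1'}, Y_{\tau_2'} \mid U) - \zeta_2 - \zeta, \]

\[ \frac{1}{n} \log M_1 \le \min_{\tau' \in T_1 \times T_2} \; \inf_{W' \in \mathcal{W}_{\tau'}} I(Z_{W'}; X_{\tau_1'} \mid Y_{\tau_2'}, U) - \zeta_3 - \zeta, \]

\[ \frac{1}{n} \log M_2 \le \min_{\tau' \in T_1 \times T_2} \; \inf_{W' \in \mathcal{W}_{\tau'}} I(Z_{W'}; Y_{\tau_2'} \mid X_{\tau_1'}, U) - \zeta_4 - \zeta, \]

for some $\delta > 0$.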

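Now assume that $(R_0, R_1, R_2)$ is contained in $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$. Hence, there is a $p \in \Pi_1$ such that

\[ (R_0, R_1, R_2) \in \bigcap_{(\tau_1', \tau_2') \in T_1 \times T_2} \; \bigcap_{W' \in \mathcal{W}_{\tau'}} \mathcal{R}_{\mathrm{CM}}(p, \tau_1', \tau_2', W'). \]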



(29) 



For n large, we can find numbers Mq, Mi, M2 satisfying 

n 2 

Choose $\delta$ and $\zeta$ such that $\max\{\zeta_1, \zeta_2, \zeta_3, \zeta_4\} + \zeta \le \varepsilon/2$. Inserting this in (29) establishes the existence of a sequence of
random codes whose mean average error converges to 0 with rate $\zeta$. Hence, for every $(R_0, R_1, R_2) \in \mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$,
one can find random codes according to the procedure described above with rates close to $(R_0, R_1, R_2)$ and with
an exponentially small maximum error probability.

4) Extracting a Deterministic Code for $|\mathcal{W}| < \infty$: The next step is to extract a deterministic code with the
same rate triple and with small average error from the random one. This is easy when $|\mathcal{W}| < \infty$; an approximation
argument similar to the one in ||2l solves the problem for $|\mathcal{W}| = \infty$. So let us first assume that $|\mathcal{W}| < \infty$. For
$\tau = (\tau_1, \tau_2) \in T_1 \times T_2$ and $W \in \mathcal{W}_\tau$, we define on the underlying probability space $(\Omega, \mathcal{F}, P)$ the random variable

This gives the average error for a channel state $W \in \mathcal{W}_\tau$ and the random code determined by the elementary event
$\omega \in \Omega$. For every $W \in \mathcal{W}$ and every $(R_0, R_1, R_2)$ in $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$, we found above a random code with block
length $n$ and message set $[1, M_0^{(n)}] \times [1, M_1^{(n)}] \times [1, M_2^{(n)}]$, and a $\zeta > 0$ such that
and $(1/n) \log M_\nu^{(n)} \ge R_\nu - \varepsilon$ for $\nu = 0, 1, 2$ if $n$ is large (the bound on the mean maximum error a fortiori also
holds for the mean average error). For $0 < \zeta' < \zeta$, define the set

If the intersection of these sets over all $W \in \mathcal{W}$ is nonempty, we can infer the existence of a deterministic code$_{\mathrm{CM}}(n, M_0^{(n)}, M_1^{(n)}, M_2^{(n)}, 2^{-n\zeta'})$ with
exponentially small error probability. And indeed, the Markov inequality implies



W£W w&w 



> 1 - 2"^ J2 ]E[Pj^] 
wew 

> 1 - |W|2-"('^^-'^) > 0, 



so this intersection must be nonempty. This proves the existence of a deterministic code$_{\mathrm{CM}}(n, M_0^{(n)}, M_1^{(n)}, M_2^{(n)}, 2^{-n\zeta'})$
with exponentially decaying average error probability for every $(R_0, R_1, R_2) \in \mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$, so this whole set is
achievable.
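The last step of the Markov-inequality argument requires $|\mathcal{W}|\, 2^{-n(\zeta - \zeta')} < 1$, which holds once the blocklength is large enough. The Python sketch below computes the smallest such $n$; the function name and the sample values are illustrative assumptions.

```python
import math


def min_blocklength_for_derandomization(num_states, zeta, zeta_prime):
    """Smallest n with num_states * 2**(-n * (zeta - zeta_prime)) < 1."""
    if num_states < 1:
        raise ValueError("the compound channel needs at least one state")
    if not 0 < zeta_prime < zeta:
        raise ValueError("need 0 < zeta_prime < zeta")
    # The condition is equivalent to n > log2(num_states) / (zeta - zeta_prime).
    threshold = math.log2(num_states) / (zeta - zeta_prime)
    return max(1, math.floor(threshold) + 1)


# Hypothetical numbers: 8 channel states, exponents 0.5 and 0.25.
n_min = min_blocklength_for_derandomization(8, 0.5, 0.25)
```

The required blocklength grows only logarithmically in the number of states, which is why the finite case is easy and the infinite case needs the separate approximation argument.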

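5) Approximation for $|\mathcal{W}| = \infty$: For a positive integer $N$ to be chosen later, we first define an approximating
compound discrete memoryless MAC. For every $W \in \mathcal{W}_N$, $W(z|x, y)$ is a multiple of $(2N|T_1||T_2|)^{-1}$ for all
$x \in \mathcal{X}$, $y \in \mathcal{Y}$, $z \in \mathcal{Z}$. Clearly, $|\mathcal{W}_N| \le (2N|T_1||T_2| + 1)^{|\mathcal{X}||\mathcal{Y}||\mathcal{Z}|}$. The following is a slight variation of JH Lemma
4].

Lemma 2. For every $N \ge 2|\mathcal{Z}|$, there is a function $f_N : \mathcal{W} \to \mathcal{W}_N$ satisfying $f_N(\mathcal{W}_\tau) \cap f_N(\mathcal{W}_{\tau'}) = \emptyset$ if $\tau \ne \tau'$ such
that for every $W \in \mathcal{W}$,

\[ |W(z|x, y) - f_N(W)(z|x, y)| \le \frac{|\mathcal{Z}|}{N}, \tag{30} \]

\[ W(z|x, y) \le \exp\!\left( \frac{2|\mathcal{Z}|^2}{N} \right) f_N(W)(z|x, y). \tag{31} \]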

Let $N$ be as in the lemma and let $f_N$ be the corresponding function from $\mathcal{W}$ to $\mathcal{W}_N$. Let $p \in \Pi_1$, $\tau = (\tau_1, \tau_2) \in
T_1 \times T_2$, and $W \in \mathcal{W}_\tau$. By (30) and 14| Lemma L2.7] (which quantifies the uniform continuity of entropy), one
has the inequalities

\Z\^ \Z^ 

\I{Zw ; Xr^ ^Yr^) - I{Zf^ (W) ; Xt^ , >^r2 ) I < -2 

\I{Zw',Xri\Yr2, U) - I{Zf^(_w)',Xr:^\Yr2, U)\ < —2 
\IiZw;YrJXr„U)-IiZf^^Wy,Yr,\Xr„U)\<-2 ^^ 

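Now fix a triple $(R_0, R_1, R_2)$ which is contained in the interior of $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$. The above inequalities imply
that for large $N$ it is contained in the interior of $\mathcal{C}^*_{\mathrm{CM}}(f_N(\mathcal{W}), \tilde t_1, \tilde t_2)$ defined through the channel $(f_N(\mathcal{W}), \tilde t_1, \tilde t_2)$.
Here, the necessarily finite partitions $\tilde t_\nu = \{ \tilde{\mathcal{W}}_{\tau_\nu} \subset \mathcal{W}_N : \tau_\nu \in T_\nu \}$ ($\nu = 1, 2$) of $\mathcal{W}_N$ are defined by

\[ \tilde{\mathcal{W}}_{\tau_\nu} = f_N(\mathcal{W}_{\tau_\nu}); \]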

recall ( fTsT l. These really are partitions by Lemma |2] The achievability result in IIII-A4] established the existence of 
codescM(n, A^o"^^l"^*^2"^2-"?) for the compound MAC (/Ar(W), tiJa) such that 

For N large enough, one has {l/n)\ogMv > Ry — e. Then, the above sequence of codes for (WAr,fi,i2) 
has the desired rates for (W,ii,i2)- It remains to bound the average error incurred when applying the codes for 
transmission over (yV,ii,i2)- For fixed n, let the codecM("-, -M^(j" , Af}" , Mj , 2^"'=) have the form (|3]l. For any 
W e VVriT2, <EB implies that the average error can be bounded by 



MqMiM2^^ ■> ^J''^"'- M0M1M2 

<r f f^^ 2121 

< exp —n C in 2 



N 

By enlarging $N$ if necessary, this goes to zero as $n$ approaches infinity, so one obtains an exponentially small average
probability of error. One checks easily that the existence of a sequence of codes$_{\mathrm{CM}}(n, M_0^{(n)}, M_1^{(n)}, M_2^{(n)}, 2^{-n\zeta})$
with $(1/n) \log M_\nu^{(n)} \ge R_\nu - \varepsilon$ for every $(R_0, R_1, R_2)$ in the interior of $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$ implies the existence of
such a sequence also for the rate triples lying on the boundary of $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$.

6) Convexity and Bound on $|\mathcal{U}|$: The convexity of $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t_2)$ is clear by the concavity of mutual information
in the input distributions. The bounds on $|\mathcal{U}|$ follow in the same way as in 12011 .

B. The MAC with Conferencing Encoders 

The achievability part of Theorem 2 relies on the achievability part of Theorem 1. We first define the Willems
conferencing functions that will turn out to be optimal for large blocklengths in the course of the proof. Then, we
show how Theorem 1 can be applied to design a code$_{\mathrm{CONF}}$ from a code$_{\mathrm{CM}}$ using these conferencing functions if
certain conditions on the rates are fulfilled. Next, we show that these conditions can be fulfilled. Finally, we show
that the average error of the conferencing codes thus defined is small. As in the achievability proof for the MAC
with common message, it suffices to assume that the receiver has no CSIR.

1) Preliminary Considerations: Let $[1, M_1]$ and $[1, M_2]$ be message sets, let $n$ be a blocklength, and let $C_1, C_2$
be conferencing capacities. If $n$ is large enough, we can construct a pair of simple one-shot Willems conferencing
functions (cf. Example 2) with these message sets which will be admissible with respect to $n$ and $C_1, C_2$. The
blocklength needs to be large enough to ensure the existence of positive integers $V_1, V_2$ with

\[ \frac{1}{n} \log |T_\nu| \le \frac{1}{n} \log V_\nu \le C_\nu \qquad (\nu = 1, 2). \tag{32} \]
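Condition (32) asks for an integer $V_\nu$ with $|T_\nu| \le V_\nu \le 2^{nC_\nu}$, which exists as soon as $\lfloor 2^{nC_\nu} \rfloor \ge |T_\nu|$. A minimal Python sketch of this search follows; the function name and example values are hypothetical, and the choice $V_\nu = \lfloor 2^{nC_\nu} \rfloor$ is just one admissible option rather than the one used later in the proof.

```python
import math


def min_blocklength_for_conference(capacity, csit_size):
    """Return (n, V) with n minimal such that csit_size <= V <= 2**(n * capacity).

    V is taken as the largest admissible integer, floor(2**(n * capacity)).
    """
    if capacity <= 0:
        raise ValueError("the conferencing capacity must be positive")
    if csit_size < 1:
        raise ValueError("the CSIT partition needs at least one element")
    n = 1
    while math.floor(2 ** (n * capacity)) < csit_size:
        n += 1
    return n, math.floor(2 ** (n * capacity))


# Hypothetical values: capacity 0.5 bit per channel use, three CSIT elements.
n_conf, v_conf = min_blocklength_for_conference(0.5, 3)
```

Any positive conferencing capacity therefore suffices to convey the CSIT index once the blocklength is large, consistent with the complete exchange of CSIT noted in Remark 5.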



Then define 



fj-f 



T„ 



A Aft 
and



?. := 






if /i^ > 2 
if ^y = 1. 



Every ii, 6 [1, Afi^] can be written uniquely as 



<•!/ \^V -^Jsi^ ^T' -t-^; 



(33) 



where i,y G [1,/^!/] and where 



te 



[l,C. 



if ii, < ^i„ - 1, 



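The conferencing function $g_\nu : [1, M_\nu] \times T_\nu \to [1, \mu_\nu] \times T_\nu$ can now be defined by

(34)

Note that by (32),

\[ \frac{1}{n} \log \big| [1, \mu_\nu] \times T_\nu \big| \le \frac{1}{n} \log V_\nu \le C_\nu, \tag{35} \]

so $g_\nu$ is an admissible one-shot Willems conferencing function.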

2) Coding for $C_1, C_2 > 0$: Now we show how to construct a code$_{\mathrm{CONF}}$ using the conferencing functions defined
above and the codes$_{\mathrm{CM}}$ whose existence was proved in III-A. We assume $C_1, C_2 > 0$. Let $(R_1, R_2)$ be contained
in $\mathcal{C}^*_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, C_1, C_2)$. Set



Rq :— Ri + R2. 






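Then $(R_0', R_1', R_2')$ is contained in $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t, t)$, defined through $(\mathcal{W}, t, t)$, where the CSIT partition $t$ of both
encoders is given by

\[ t = \{ \mathcal{W}_{\tau_1\tau_2} : (\tau_1, \tau_2) \in T_1 \times T_2 \}. \tag{36} \]

One knows by Theorem 1 that for any $\varepsilon > 0$, there is a $\zeta = \zeta(\varepsilon)$ such that for large $n$, there is a code$_{\mathrm{CM}}(n, M_0^{(n)}, M_1^{(n)}, M_2^{(n)}, 2^{-n\zeta})$ for $(\mathcal{W}, t, t, r)$ with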

<>iiogAf(")>i?:,-£. 

For fixed n, let such a codecM have the form 



(37) 



(38) 
In III-B3 we will show that if $n$ is large enough, one can find $M_1, M_2$ and $V_1, V_2$ such that (32) is satisfied for
$\nu = 1, 2$ and such that

1 . . ^ ^3^^ 

(40) 

(41) 

(42) 
(43) 

Because of the validity of (32), one can carry out the construction of the conferencing functions described in III-B1.
As noted in (35), the pair $(g_1, g_2)$ defined in (34) for $\nu = 1, 2$ is an admissible pair of conferencing functions. By
(41)-(43), one can naturally consider the set $[1, \mu_1] \times [1, \mu_2]$ as a subset of $[1, M_0^{(n)}]$ and the sets $[1, \xi_\nu]$ as equal to
$[1, M_\nu^{(n)}]$. With $(g_1, g_2)$ and recalling the alternative definition of codes$_{\mathrm{CONF}}$ right after Definition 5, one can now
define a code$_{\mathrm{CONF}}$ by a family as in (7) as follows: assume that $j \in [1, M_1]$ and $k \in [1, M_2]$ have a representation
$(i_1, j')$ and $(i_2, k')$ as in (33). Then

xir := i;i:i,,M (44) 





n 


log Ail 


<Ri 






1 
n 


logA*2 


<R2 




Mo 
2|Ti| 


n) 

|7^ 


- < /ii/l2 < A'C\ 








a = 


Af,("), 








6 = 


Mf). 



-rir2 
f(n)l 



yr-ya:iu" (45) 



where $(i_1, i_2)$ is to be considered an element of $[1, M_0^{(n)}]$. The decoding sets are defined as

\[ F_{jk} := F_{(i_1, i_2) j' k'}. \tag{46} \]

This code is a code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2)$ for the compound MAC with conferencing encoders as in Definition
5 because it satisfies (8) and (9) for the pair of conferencing functions $(g_1, g_2)$. We now show that it also achieves
the desired rates. Without loss of generality, one can assume that

- log(Ai, - 1) > - log Ai. - ^ A ^ (47) 

n n 4 2 

if /i,y > 1. We may also assume that 

ilog(2|Ti||T2|)<e. (48) 

Ti- 
lt follows for large enough n from ( [37] i and (|4T]i-(|48]l and the definition of the R'^ that 



- \0gM1M2 > - log(M^")M(")Af (")) - - log(2|Ti||T2|) - | A e 
n n n 2 



(49) 



>Ri+R2- 5£. 
Further by ^, gOl), dHI, gill, and (O, for ;/ = 1, 2, 



- log M^ < - log n^ + - log £,u<Rv + K = Ru. (50) 

n n n 
Combining (49) and (50) yields

\[ \frac{1}{n} \log M_\nu \ge R_\nu - 5\varepsilon, \]

so the rates are as desired. In Subsection III-B4 the average error of this code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2)$ will be
shown to be small, thus finishing the proof of the achievability part of Theorem 2.

3) Finding $M_1, M_2, V_1, V_2$ for $C_1, C_2 > 0$: Let a positive integer $n$ be fixed. Without loss of generality, let

< £ < 7?i A R2, so again without loss of generality, one can assume Ri < (1/n) logA/Q . We choose 



Vi = \Ti 



onRi 



and 



V2 = IT, 



mi 



Hence (32) and (39)-(41) are always satisfied. In order to find $M_1$ and $M_2$, three cases need to be distinguished.
In all of the cases, it is straightforward to check that (42) and (43) hold.

Case 1: R^ = R^ for 1/ = 1,2. Then R'^ = R'^ ^ 0. Set M^ = V^/\T^\ for i/ = 1,2. Then /x^ = M^, so 

fi = 6 = 1- 

Case 2: R^, — Ci, for v — 1,2. Choose A/^ such that 



AU - 1 



= Ad") 



.K/l^.l-i. 

for i^ = 1, 2. Then ^i^ = K/|r^| and S,u = Af^"^ 

Ca^e 3a: Ri = Ci, ^2 = R2- Then i?2 = and R2 < C2. Choose A/2 = V2/IT2I and Afi such that 



ii 



Mi-1 



= m\^K 



.Vi/lTil-l. 
Then ^i^^Vy/\T^\ for both z^, and note that ^1 = Af|"' and 6 = 1- 

Case 3b: R2 = C2, i?i = i?i. Analogous to case 3. 

4) The average error for $C_1, C_2 > 0$: Recall the form (38) of the code$_{\mathrm{CM}}(n, M_0^{(n)}, M_1^{(n)}, M_2^{(n)}, 2^{-n\zeta})$ and
the definitions (44)-(46) of the code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2)$ in III-B2. We now bound its average error. Let the
channel state $W \in \mathcal{W}$ be arbitrary. The code$_{\mathrm{CM}}$ satisfies



1 



Afi")A^^)A/^) 



J2 W-{Ft^,,,m,,yl,,)<2- 



nC 






where the sum ranges over the message set $[1, M_0^{(n)}] \times [1, M_1^{(n)}] \times [1, M_2^{(n)}]$ of the code$_{\mathrm{CM}}$. With assumption (47)
and (41)-(43), one has

M1M2 > 2-"(';/2)+i|ri||r2|Af^")Af(")Af2("\ 

One thus obtains for the average error of the code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2)$ that
1 



M1M2 



E 



M/"(^,|xJ„yJ,) 



< 



j;/ce[i,A/i]x[i,M2] 
2"(C/2) 1 



E 



< 



2|ri||r2| M^^'^MrMr („, („) („, 

2-n(C-C/2) 

2|ri||r2| 



t¥"(i^,fe,|ijv,,ylfc,) 
This proves that the average error of this code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2)$ is exponentially small. Thus the rate pair
$(R_1, R_2)$ is achievable, and this finishes the proof of the achievability part of Theorem 2 for the case $C_1, C_2 > 0$.

5) The case $C_1 > 0$, $C_2 = 0$: First note that the case $C_1 = 0$, $C_2 > 0$ is analogous to the case $C_1 > 0$, $C_2 = 0$
which is treated here. One can use all the methods used in the case $C_1, C_2 > 0$ for the first user. An admissible one-shot
Willems conferencing function $g_1$ can be constructed as in III-B1. Then let $(R_1, R_2) \in \mathcal{C}^*_{\mathrm{CONF},1}(\mathcal{W}, t_1, t_2, C_1)$.
One checks that the triple $(R_0', R_1', R_2')$ defined as in III-B2 is contained in $\mathcal{C}^*_{\mathrm{CM}}(\mathcal{W}, t_1, t)$, where $t$ also is defined
as in III-B2. Given a blocklength $n$, one then can find $M_1, M_2, V_1, V_2$ as in III-B3, where only the relevant cases
need to be considered. This then defines a good code$_{\mathrm{CONF}}$.

6) Convexity and Bound on $|\mathcal{U}|$: The convexity of $\mathcal{C}_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, r, C_1, C_2)$ is inherited from the convexity
of $\mathcal{C}_{\mathrm{CM}}(\mathcal{W}, t_1, t_2, r)$. Also the bound on the cardinality of the set $\mathcal{U}$ appearing in the parametrization of the rate
regions comes from the bound on the range of the auxiliary random variable appearing in the parametrization of
the capacity region of the compound MAC with common message.

IV. The Converses 

We will concentrate on the converse for the MAC with conferencing encoders because it requires some non-standard
preliminaries. For the converse for the MAC with common message, we only show how to start the proof;
the rest is similar to the proof for the MAC with conferencing encoders. For both outer bounds, one assumes perfect
CSIR. As we will prove that, fixing a pair of CSIT partitions, this outer bound coincides with the inner bound with
no CSIR, this includes all possible permissible types of CSIR.

A. The Converse for the MAC with Conferencing Encoders

First we define what we mean exactly by the statement that a weak converse holds for $(\mathcal{W}, t_1, t_2, r)$ with
codes$_{\mathrm{CONF}}$.

Definition 9. A weak converse holds for the compound MAC $(\mathcal{W}, t_1, t_2, r)$ with codes$_{\mathrm{CONF}}$ if the average error $\lambda$
of every code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2, \lambda)$ whose rate pair $((1/n) \log M_1, (1/n) \log M_2)$ is further than $\varepsilon > 0$ from
$\mathcal{C}^*_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, C_1, C_2)$ satisfies $\lambda \ge \lambda(\varepsilon) > 0$ if $n$ is large enough. Without loss of generality we measure distance
in the $\ell^1$-norm, so the statement that the rate pair of the code is further than $\varepsilon$ from $\mathcal{C}^*_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, C_1, C_2)$ can
be formulated as

\[ \min_{(R_1, R_2) \in \mathcal{C}^*_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, C_1, C_2)} \left( \left| \frac{1}{n} \log M_1 - R_1 \right| + \left| \frac{1}{n} \log M_2 - R_2 \right| \right) \ge \varepsilon. \tag{51} \]



In IV-A1 we show that the weak converse for the compound MAC with conferencing encoders is implied by the
weak converse for an auxiliary compound MAC with different CSIT and a slightly restricted kind of cooperation.
CSIR will also be assumed to be perfect for that channel. In IV-A2 we then show that the weak converse holds
for this auxiliary MAC. Throughout the section, we will assume that $C_1, C_2 > 0$. The case of one conferencing
capacity being equal to zero is treated analogously.

1) An Auxiliary MAC: We now describe the auxiliary MAC. Let $(\mathcal{W}, t_1, t_2, r)$ be given. As we assume perfect
CSIR, we may assume $r = \{ \{W\} : W \in \mathcal{W} \}$. Let $t_1, t_2$ be CSIT partitions as in O and define the CSIT
partition $t$ as in (36). Let the channel $(\mathcal{W}, t, t, r)$ be given (symmetric CSIT!). We now define what we mean by a
code$_{\mathrm{AUX}}(n, M_1, M_2, C_1, C_2)$ for $(\mathcal{W}, t, t, r)$, where $n, M_1, M_2$ are positive integers and $C_1, C_2 > 0$.

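Definition 10. A code$_{\mathrm{AUX}}(n, M_1, M_2, C_1, C_2)$ is a quadruple $(f_1, f_2, g, \Phi)$ of functions which satisfy

$$\begin{aligned}
f_1 &: [1,M_1] \times \Gamma \times T_1 \times T_2 \to \mathcal{X}^n,\\
f_2 &: [1,M_2] \times \Gamma \times T_1 \times T_2 \to \mathcal{Y}^n,\\
g &: [1,M_1] \times [1,M_2] \to \Gamma,\\
\Phi &: \mathcal{Z}^n \times R \to [1,M_1] \times [1,M_2],
\end{aligned}$$

where $\Gamma$ is a finite set and where $g$ satisfies

$$\frac{1}{n}\log\|g_j\| \le C_2 \quad\text{for all } j \in [1,M_1], \tag{52}$$

$$\frac{1}{n}\log\|g_k\| \le C_1 \quad\text{for all } k \in [1,M_2] \tag{53}$$

for the functions $g_j$ and $g_k$ defined by $g_j(k) = g_k(j) = g(j,k)$. The number $n$ is called the blocklength of the code.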
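The constraints (52) and (53) can be checked mechanically for a small example. The following is a minimal sketch, assuming that $\|g_j\|$ counts the distinct values of $g(j,\cdot)$ and $\|g_k\|$ the distinct values of $g(\cdot,k)$, with logarithms taken to base 2. The names `conferencing_loads`, `is_admissible` and `toy_g` are hypothetical and serve only as illustration.

```python
import math
from typing import Callable, Hashable


def conferencing_loads(g: Callable[[int, int], Hashable],
                       M1: int, M2: int, n: int) -> tuple:
    """Return (max_j (1/n) log2 ||g_j||, max_k (1/n) log2 ||g_k||).

    Assumption: ||g_j|| is the number of distinct values of g(j, .),
    and ||g_k|| is the number of distinct values of g(., k).
    """
    load_given_j = max(
        math.log2(len({g(j, k) for k in range(1, M2 + 1)})) / n
        for j in range(1, M1 + 1)
    )
    load_given_k = max(
        math.log2(len({g(j, k) for j in range(1, M1 + 1)})) / n
        for k in range(1, M2 + 1)
    )
    return load_given_j, load_given_k


def is_admissible(g, M1, M2, n, C1, C2, tol=1e-12):
    """Check (52) and (53): the j-side load against C2, the k-side load against C1."""
    load_given_j, load_given_k = conferencing_loads(g, M1, M2, n)
    return load_given_j <= C2 + tol and load_given_k <= C1 + tol


def toy_g(j, k):
    # Hypothetical conferencing function: one bit about j, two bits about k.
    return (j % 2, k % 4)
```

With `M1 = M2 = 8` and `n = 4`, the toy function reveals two bits about `k` for fixed `j` and one bit about `j` for fixed `k`, so it is admissible exactly when `C2` is at least 0.5 and `C1` is at least 0.25.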

Thus an auxiliary code is one where only messages are exchanged, and where this is done independently of the CSIT. Like codes$_{\mathrm{CONF}}$, every code$_{\mathrm{AUX}}$ can also be described by a family analogous to (7) and a conferencing MAC like the $g$ from the above definition.

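Definition 11. The code$_{\mathrm{AUX}}(n, M_1, M_2, C_1, C_2)$ is a code$_{\mathrm{AUX}}(n, M_1, M_2, C_1, C_2, \lambda)$ if its average error, defined as for codes$_{\mathrm{CONF}}$, is at most $\lambda$.

In Subsubsection IV-A2 we will show a weak converse for $(\mathcal{W}, t, t, r)$ with codes$_{\mathrm{AUX}}$:

Lemma 3. Let a code$_{\mathrm{AUX}}(n, M_1, M_2, C_1 + \delta, C_2 + \delta, \lambda)$ be given with

$$\min_{(R_1,R_2)\in\mathcal{C}_{\mathrm{CONF}}(\mathcal{W},t,t,C_1+\delta,C_2+\delta)} \left\{ \left|\frac{1}{n}\log M_1 - R_1\right| + \left|\frac{1}{n}\log M_2 - R_2\right| \right\} > \varepsilon \tag{54}$$

for some $\varepsilon > 0$. Then there is a $\lambda(\varepsilon, \delta) > 0$ such that $\lambda > \lambda(\varepsilon, \delta)$ for sufficiently large $n$.

We will show this lemma in IV-A2. This together with the next lemma shows a weak converse as claimed in Theorem 2.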

Lemma 4. For every $\delta > 0$ there exists a positive integer $n_0$ such that for every $n > n_0$ and every code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2, \lambda)$ for $(\mathcal{W}, t_1, t_2, r)$ there is a code$_{\mathrm{AUX}}(n, M_1, M_2, C_1 + \delta, C_2 + \delta, \lambda)$ for $(\mathcal{W}, t, t, r)$.

Deduction of the weak converse for Theorem 2 from the weak converse for the auxiliary MAC: Before proving Lemma 4 we show how it implies a weak converse for $(\mathcal{W}, t_1, t_2, r)$ with codes$_{\mathrm{CONF}}$. Assume that the code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2, \lambda)$ for $(\mathcal{W}, t_1, t_2, r)$ satisfies (51). Let $\delta > 0$ be arbitrary. By Lemma 4, for this code$_{\mathrm{CONF}}$, there is a code$_{\mathrm{AUX}}(n, M_1, M_2, C_1 + \delta, C_2 + \delta, \lambda)$ for $(\mathcal{W}, t, t, r)$. As $\mathcal{C}_{\mathrm{CONF}}(\mathcal{W}, t_1, t_2, C_1, C_2) = \mathcal{C}_{\mathrm{CONF}}(\mathcal{W}, t, t, C_1, C_2)$ for all $C_1, C_2$, Lemma 3 implies that $\lambda > \lambda(\varepsilon, \delta) > 0$ for large $n$. This implies the desired weak converse for $(\mathcal{W}, t_1, t_2, r)$ with codes$_{\mathrm{CONF}}$. ■

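Proof of Lemma 4: Let a code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2, \lambda)$ for $(\mathcal{W}, t_1, t_2, r)$ be given which has the form (7) and which uses the conferencing function $g$. Without loss of generality, assume that

$$\frac{1}{n}\log|T_1||T_2| \le \delta. \tag{55}$$

Set $\tilde\Gamma := \Gamma \times T_1 \times T_2$ and define

$$\pi_{\tau_1\tau_2} : \tilde\Gamma \to \Gamma$$

to be the projection of $\tilde\Gamma$ onto $\Gamma \times \{(\tau_1, \tau_2)\}$. Further, define a conferencing MAC $\tilde g : [1, M_1] \times [1, M_2] \to \tilde\Gamma$ by

$$\tilde g(j,k) = \bigl(g(j,k,\tau_1,\tau_2)\bigr)_{(\tau_1,\tau_2) \in T_1 \times T_2}.$$

As $g$ is the conferencing MAC of the code$_{\mathrm{CONF}}(n, M_1, M_2, C_1, C_2, \lambda)$, one obtains

$$\frac{1}{n}\log\|\tilde g_j\| \le C_2 + \delta \quad\text{for all } j \in [1,M_1],$$

$$\frac{1}{n}\log\|\tilde g_k\| \le C_1 + \delta \quad\text{for all } k \in [1,M_2].$$

This together with (55) implies that $\tilde g$ is admissible for a code$_{\mathrm{AUX}}$ with conferencing capacities $C_\nu + \delta$. Further, take the codewords and decoding sets of the given code$_{\mathrm{CONF}}$ unchanged as those of the new code. One checks immediately that the code thus defined is a code$_{\mathrm{AUX}}(n, M_1, M_2, C_1 + \delta, C_2 + \delta, \lambda)$ for $(\mathcal{W}, t, t, r)$. This proves the lemma. ■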

2) The Weak Converse for the Auxiliary MAC: Here we prove Lemma 3. Let $\delta > 0$ be arbitrary and set

$$\tilde C_\nu := C_\nu + \delta.$$

Let a code$_{\mathrm{AUX}}(n, M_1, M_2, \tilde C_1, \tilde C_2, \lambda)$ be given which satisfies (54). We must show that there exists a $\lambda(\varepsilon, \delta)$ such that $\lambda > \lambda(\varepsilon, \delta)$ for large $n$.

Assume that the above code$_{\mathrm{AUX}}(n, M_1, M_2, \tilde C_1, \tilde C_2, \lambda)$ has the form

$$\{(x_{jk}^{\tau}, y_{jk}^{\tau}, F_{jk}^{\rho}) : (\tau_1, \tau_2, \rho) \in T_1 \times T_2 \times R\},$$

and uses the conferencing MAC $g$. We may assume that $\lambda < 1/4$, because otherwise, we are done. Consider a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ on which the following random variables are defined:

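• $(S_1, S_2)$ is uniformly distributed on $[1, M_1] \times [1, M_2]$,

• $G = g(S_1, S_2)$,

• for each $\tau = (\tau_1, \tau_2) \in T_1 \times T_2$, the codewords $X^\tau = f_1(S_1, G, \tau_1, \tau_2)$ and $Y^\tau = f_2(S_2, G, \tau_1, \tau_2)$,

• for each $W \in \mathcal{W}_\tau$ a $Z^W$ taking values in $\mathcal{Z}^n$ such that for $x \in \mathcal{X}^n$, $y \in \mathcal{Y}^n$, $(j,k) \in [1, M_1] \times [1, M_2]$, and $\gamma \in \Gamma$,

$$\mathbb{P}[Z^W = z \mid X^\tau = x, Y^\tau = y, S_1 = j, S_2 = k, G = \gamma] = W^n(z|x,y).$$

Fix a $\tau \in T_1 \times T_2$ and a $W \in \mathcal{W}_\tau$. By Fano's inequality,

$$H(S_1, S_2 | Z^W) \le \lambda \log(M_1 M_2 - 1) + h(\lambda) =: \Delta_1,$$

where $h$ denotes binary entropy. By the chain rule for entropy,

$$\Delta_1 \ge H(S_1, S_2 | Z^W) \ge H(S_1, S_2 | Z^W, G) \ge H(S_1 | S_2, Z^W, G) \vee H(S_2 | S_1, Z^W, G). \tag{56}$$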

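(Several rules for calculating with entropy are collected in [4, Chapter 1.3].) Using (56), $M_1$ can be bounded via

$$\begin{aligned}
\log M_1 &= H(S_1 | S_2)\\
&= I(S_1; Z^W, G | S_2) + H(S_1 | S_2, Z^W, G)\\
&\le I(S_1; Z^W, G | S_2) + \Delta_1.
\end{aligned} \tag{57}$$

One obtains an analogous bound on $M_2$,

$$\log M_2 \le I(S_2; Z^W, G | S_1) + \Delta_1. \tag{58}$$

For $M_1 M_2$ one has the bounds

$$\begin{aligned}
\log M_1 M_2 &= H(S_1, S_2)\\
&= I(S_1, S_2; Z^W, G) + H(S_1, S_2 | Z^W, G)\\
&\le I(S_1, S_2; Z^W, G) + \Delta_1
\end{aligned} \tag{59}$$

and

$$\begin{aligned}
\log M_1 M_2 &= H(S_1, S_2)\\
&= I(S_1, S_2; Z^W) + H(S_1, S_2 | Z^W)\\
&\le I(S_1, S_2; Z^W) + \Delta_1.
\end{aligned} \tag{60}$$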
Using the chain rule, one splits up the mutual information terms in the bounds (57)-(59) into two terms each such that the channel only appears in the second one:

$$\begin{aligned}
I(S_1; Z^W, G | S_2) &= I(S_1; G | S_2) + I(S_1; Z^W | S_2, G),\\
I(S_2; Z^W, G | S_1) &= I(S_2; G | S_1) + I(S_2; Z^W | S_1, G),\\
I(S_1, S_2; Z^W, G) &= I(S_1, S_2; G) + I(S_1, S_2; Z^W | G).
\end{aligned}$$
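The Fano term $\Delta_1 = \lambda\log(M_1M_2 - 1) + h(\lambda)$ used in these bounds is easy to evaluate numerically. The sketch below assumes logarithms to base 2; the function names `binary_entropy` and `fano_term` are illustrative and do not come from the paper. After division by the blocklength $n$, the term is at most $\lambda$ times the sum rate plus $h(\lambda)/n$, which is why it becomes negligible for small $\lambda$ and large $n$.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def fano_term(lam: float, M1: int, M2: int) -> float:
    """Delta_1 = lam * log2(M1*M2 - 1) + h(lam); requires M1*M2 >= 2."""
    if M1 * M2 < 2:
        raise ValueError("need at least two message pairs")
    return lam * math.log2(M1 * M2 - 1) + binary_entropy(lam)
```

For example, with `n = 10`, `M1 = M2 = 2**10` and `lam = 0.01`, the value `fano_term(0.01, 2**10, 2**10) / 10` stays below `2 * 0.01 + binary_entropy(0.01) / 10`.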

These mutual information terms and the one in (60) are bounded successively in the following. First, the terms not depending on the channel are considered. By the properties of $g$, if the value of $S_2$ is given, the random variable $G$ can assume at most $2^{n\tilde C_1}$ values, hence

$$I(S_1; G | S_2) \le n\tilde C_1.$$

An analogous argument shows

$$I(S_2; G | S_1) \le n\tilde C_2.$$

Finally, as in Remark 4 one sees that $H(G) \le n(\tilde C_1 + \tilde C_2)$, so

$$I(S_1, S_2; G) \le n(\tilde C_1 + \tilde C_2).$$

Next we treat the remaining mutual information terms. Recall that for $(j,k) \ne (j',k')$, the corresponding codewords do not need to be distinct. This is a problem when $S_1, S_2$ are to be replaced by $X^\tau, Y^\tau$ in the expressions. Define $\Delta_2 := h(2\lambda) + 2\lambda\log(M_1 M_2)$. We next show that

$$I(S_1; Z^W | S_2, G) \le I(X^\tau; Z^W | Y^\tau, G) + \Delta_2, \tag{61}$$

$$I(S_2; Z^W | S_1, G) \le I(Y^\tau; Z^W | X^\tau, G) + \Delta_2, \tag{62}$$

$$I(S_1, S_2; Z^W | G) \le I(X^\tau, Y^\tau; Z^W | G) + \Delta_2, \tag{63}$$

$$I(S_1, S_2; Z^W) \le I(X^\tau, Y^\tau; Z^W) + \Delta_2. \tag{64}$$

This allows us to do the replacement and to control the error incurred by the replacement. In order to show (61)-(64), we write

$$\begin{aligned}
I(Z^W; S_1 | S_2, G) &= H(Z^W | S_2, G) - H(Z^W | S_1, S_2, G),\\
I(Z^W; S_2 | S_1, G) &= H(Z^W | S_1, G) - H(Z^W | S_1, S_2, G),\\
I(Z^W; S_1, S_2 | G) &= H(Z^W | G) - H(Z^W | S_1, S_2, G),\\
I(Z^W; S_1, S_2) &= H(Z^W) - H(Z^W | S_1, S_2).
\end{aligned}$$

One has $H(Z^W | S_2, G) \le H(Z^W | Y^\tau, G)$ and $H(Z^W | S_1, G) \le H(Z^W | X^\tau, G)$, as $(X^\tau, G)$ is a function of $(S_1, G)$ and $(Y^\tau, G)$ is a function of $(S_2, G)$. Thus in order to show (61)-(64), we need to bound the distance of $H(Z^W | S_1, S_2, G)$ from $H(Z^W | X^\tau, Y^\tau, G)$ and of $H(Z^W | S_1, S_2)$ from $H(Z^W | X^\tau, Y^\tau)$.

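Lemma 5. One has

$$\begin{aligned}
H(Z^W | S_1, S_2, G) &\ge H(Z^W | X^\tau, Y^\tau, G) - \Delta_2,\\
H(Z^W | S_1, S_2) &\ge H(Z^W | X^\tau, Y^\tau) - \Delta_2.
\end{aligned}$$

Proof: Note that as $G$ is a function of $(S_1, S_2)$,

$$H(Z^W | S_1, S_2, G) = H(Z^W, S_1, S_2, G) - H(S_1, S_2) - H(G | S_1, S_2) = H(Z^W, S_1, S_2, G) - H(S_1, S_2)$$

and

$$H(Z^W | S_1, S_2) = H(Z^W, S_1, S_2) - H(S_1, S_2).$$

Now $(Z^W, X^\tau, Y^\tau)$ is a function of $(Z^W, S_1, S_2)$ and $(Z^W, X^\tau, Y^\tau, G)$ is a function of $(Z^W, S_1, S_2, G)$, so one has

$$\begin{aligned}
H(Z^W, S_1, S_2, G) &\ge H(Z^W, X^\tau, Y^\tau, G),\\
H(Z^W, S_1, S_2) &\ge H(Z^W, X^\tau, Y^\tau).
\end{aligned}$$

Hence it suffices to show

$$H(S_1, S_2) \le H(X^\tau, Y^\tau) + \Delta_2. \tag{65}$$

Set

$$\mathcal{G}_W := \{(j,k) : W^n((F_{jk}^W)^c \mid x_{jk}^\tau, y_{jk}^\tau) < 1/2\}$$

and set $\mathcal{B}_W := ([1, M_1] \times [1, M_2]) \setminus \mathcal{G}_W$. From

$$\lambda \ge \frac{1}{M_1 M_2}\sum_{(j,k) \in \mathcal{B}_W} W^n((F_{jk}^W)^c \mid x_{jk}^\tau, y_{jk}^\tau) \ge \frac{|\mathcal{B}_W|}{2 M_1 M_2}$$

it follows that $|\mathcal{B}_W| \le 2\lambda M_1 M_2$. Now if $(j,k), (j',k') \in \mathcal{G}_W$ are distinct, then $(x_{jk}^\tau, y_{jk}^\tau) \ne (x_{j'k'}^\tau, y_{j'k'}^\tau)$, because otherwise one would obtain a contradiction to the disjointness of $F_{jk}^W$ and $F_{j'k'}^W$. We introduce the random variable $Q = 1_{\mathcal{G}_W}(S_1, S_2)$ which equals 1 if $(S_1, S_2) \in \mathcal{G}_W$ and 0 else. The above bound on the size of $\mathcal{B}_W$ implies $H(Q) \le h(2\lambda)$, since $2\lambda < 1/2$ by the assumption $\lambda < 1/4$. Therefore

$$\begin{aligned}
H(S_1, S_2) &= H(S_1, S_2, Q)\\
&\le H(S_1, S_2, Q) - H(Q) + h(2\lambda)\\
&= H(S_1, S_2 | Q) + h(2\lambda).
\end{aligned}$$

The assignment of message pairs to codewords is unique on $\mathcal{G}_W$, so

$$\begin{aligned}
H(S_1, S_2 | Q) &= H(X^\tau, Y^\tau | Q = 1)\,\mathbb{P}[Q = 1] + H(S_1, S_2 | Q = 0)\,\mathbb{P}[Q = 0]\\
&\le H(X^\tau, Y^\tau) + 2\lambda\log(M_1 M_2).
\end{aligned}$$

Altogether this shows (65), and thus the lemma. ■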
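The counting step in this proof is a Markov-type argument: if the average error is $\lambda$, at most a fraction $2\lambda$ of the message pairs can have error probability at least $1/2$. The following sketch illustrates it on made-up per-pair error probabilities. The dictionary `toy_errors` and the function names are hypothetical and are not data from the paper.

```python
def split_good_bad(errors: dict, threshold: float = 0.5):
    """Split message pairs into good (error < threshold) and bad (error >= threshold)."""
    good = [pair for pair, e in errors.items() if e < threshold]
    bad = [pair for pair, e in errors.items() if e >= threshold]
    return good, bad


def bad_set_bound_holds(errors: dict) -> bool:
    """Check |B_W| <= 2 * lam * M1 * M2, where lam is the average error."""
    lam = sum(errors.values()) / len(errors)
    _, bad = split_good_bad(errors)
    return len(bad) <= 2.0 * lam * len(errors)


# Hypothetical error probabilities for M1 = 2, M2 = 3 (average error 0.3).
toy_errors = {
    (1, 1): 0.0, (1, 2): 0.1, (1, 3): 0.6,
    (2, 1): 0.05, (2, 2): 0.9, (2, 3): 0.15,
}
```

Here two of the six pairs are bad, which is within the bound $2 \cdot 0.3 \cdot 6 = 3.6$.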

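Thus (61)-(64) is established. The next goal is to obtain a single-letter representation of the right-hand terms in (61)-(64). This is done by several applications of the chain rules. Set

$$X^\tau = (X_1^\tau, \ldots, X_n^\tau), \qquad Y^\tau = (Y_1^\tau, \ldots, Y_n^\tau), \qquad Z^W = (Z_1^W, \ldots, Z_n^W).$$

Further, set

$$Z^W_{[1,m]} := (Z_1^W, \ldots, Z_m^W) \quad\text{for } m = 1, \ldots, n.$$

One has

$$I(X^\tau; Z^W | Y^\tau, G) = \sum_{m=1}^{n}\bigl\{H(Z_m^W | Y^\tau, G, Z^W_{[1,m-1]}) - H(Z_m^W | X^\tau, Y^\tau, G, Z^W_{[1,m-1]})\bigr\}.$$

$(Y_m^\tau, G)$ is a function of $(Y^\tau, G, Z^W_{[1,m-1]})$, so

$$H(Z_m^W | Y^\tau, G, Z^W_{[1,m-1]}) \le H(Z_m^W | Y_m^\tau, G).$$

Further as the channel is memoryless,

$$\begin{aligned}
H(Z_m^W | X^\tau, Y^\tau, G, Z^W_{[1,m-1]}) &= -I(Z_m^W; Z^W_{[1,m-1]} | X^\tau, Y^\tau, G) + H(Z_m^W | X^\tau, Y^\tau, G)\\
&= H(Z_m^W | X_m^\tau, Y_m^\tau, G).
\end{aligned}$$

Hence

$$\begin{aligned}
I(X^\tau; Z^W | Y^\tau, G) &\le \sum_{m=1}^{n}\bigl\{H(Z_m^W | Y_m^\tau, G) - H(Z_m^W | X_m^\tau, Y_m^\tau, G)\bigr\}\\
&= \sum_{m=1}^{n} I(Z_m^W; X_m^\tau | Y_m^\tau, G).
\end{aligned}$$

In an analogous manner, one shows that

$$I(Y^\tau; Z^W | X^\tau, G) \le \sum_{m=1}^{n} I(Z_m^W; Y_m^\tau | X_m^\tau, G).$$

Further, with the same arguments as above,

$$\begin{aligned}
I(Z^W; X^\tau, Y^\tau | G) &= \sum_{m=1}^{n}\bigl\{H(Z_m^W | G, Z^W_{[1,m-1]}) - H(Z_m^W | X^\tau, Y^\tau, G, Z^W_{[1,m-1]})\bigr\}\\
&\le \sum_{m=1}^{n}\bigl\{H(Z_m^W | G) - H(Z_m^W | X_m^\tau, Y_m^\tau, G)\bigr\}\\
&= \sum_{m=1}^{n} I(Z_m^W; X_m^\tau, Y_m^\tau | G).
\end{aligned}$$

Finally,

$$I(Z^W; X^\tau, Y^\tau) \le \sum_{m=1}^{n} I(Z_m^W; X_m^\tau, Y_m^\tau).$$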

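Now we define the random variables that will be used for the single-letter characterization. Let $U$ take values in $[1,n] \times \Gamma$, $X_\tau$ in $\mathcal{X}$, $Y_\tau$ in $\mathcal{Y}$, and $Z_W$ in $\mathcal{Z}$, with

$$\mathbb{P}[U = (m,\gamma)] = \frac{1}{n}\,\frac{|\{(j,k) : g(j,k) = \gamma\}|}{M_1 M_2} =: p_0(m,\gamma);$$

$$\mathbb{P}[X_\tau = x \mid U = (m,\gamma)] = \frac{|\{(j,k) : x^\tau_{jk,m} = x,\ g(j,k) = \gamma\}|}{|\{(j,k) : g(j,k) = \gamma\}|} =: p_{1\tau}(x|(m,\gamma));$$

$$\mathbb{P}[Y_\tau = y \mid U = (m,\gamma)] = \frac{|\{(j,k) : y^\tau_{jk,m} = y,\ g(j,k) = \gamma\}|}{|\{(j,k) : g(j,k) = \gamma\}|} =: p_{2\tau}(y|(m,\gamma));$$

and

$$\mathbb{P}[Z_W = z \mid U = (m,\gamma), X_\tau = x, Y_\tau = y] = W(z|x,y).$$

Note that $\mathcal{U} := \mathrm{support}(p_0) \subset [1,n] \times \Gamma$ is a finite set, that $p_0 \in \mathcal{P}(\mathcal{U})$, that $p_{1\tau} \in \mathcal{K}(\mathcal{X}|\mathcal{U})$ and that $p_{2\tau} \in \mathcal{K}(\mathcal{Y}|\mathcal{U})$. Further,

$$\begin{aligned}
p_0(m,\gamma) &= \frac{1}{n}\,\mathbb{P}[G = \gamma],\\
p_{1\tau}(x|(m,\gamma)) &= \mathbb{P}[X_m^\tau = x \mid G = \gamma],\\
p_{2\tau}(y|(m,\gamma)) &= \mathbb{P}[Y_m^\tau = y \mid G = \gamma].
\end{aligned}$$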

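Combining the above equalities and inequalities with (61)-(64), this implies that

$$\frac{1}{n} I(S_1; Z^W | S_2, G) \le \frac{1}{n}\sum_{m=1}^{n} I(Z_m^W; X_m^\tau | Y_m^\tau, G) + \frac{1}{n}\Delta_2 = I(Z_W; X_\tau | Y_\tau, U) + \frac{1}{n}\Delta_2;$$

$$\frac{1}{n} I(S_2; Z^W | S_1, G) \le \frac{1}{n}\sum_{m=1}^{n} I(Z_m^W; Y_m^\tau | X_m^\tau, G) + \frac{1}{n}\Delta_2 = I(Z_W; Y_\tau | X_\tau, U) + \frac{1}{n}\Delta_2;$$

$$\frac{1}{n} I(Z^W; S_1, S_2 | G) \le \frac{1}{n}\sum_{m=1}^{n} I(Z_m^W; X_m^\tau, Y_m^\tau | G) + \frac{1}{n}\Delta_2 = I(Z_W; X_\tau, Y_\tau | U) + \frac{1}{n}\Delta_2;$$

$$\frac{1}{n} I(Z^W; S_1, S_2) \le \frac{1}{n}\sum_{m=1}^{n} I(Z_m^W; X_m^\tau, Y_m^\tau) + \frac{1}{n}\Delta_2 = I(Z_W; X_\tau, Y_\tau) + \frac{1}{n}\Delta_2.$$

Thus for every $\tau \in T_1 \times T_2$ and every $W \in \mathcal{W}_\tau$, using (56)-(64) and recalling the definitions of $\Delta_1$ and $\Delta_2$, one has the bounds

$$\frac{1}{n}\log M_1 \le \tilde C_1 + I(Z_W; X_\tau | Y_\tau, U) + \frac{1}{n}\Delta; \tag{66}$$

$$\frac{1}{n}\log M_2 \le \tilde C_2 + I(Z_W; Y_\tau | X_\tau, U) + \frac{1}{n}\Delta; \tag{67}$$

$$\frac{1}{n}\log M_1 M_2 \le \bigl\{(\tilde C_1 + \tilde C_2 + I(Z_W; X_\tau, Y_\tau | U)) \wedge I(Z_W; X_\tau, Y_\tau)\bigr\} + \frac{1}{n}\Delta. \tag{68}$$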
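The structure of the sum-rate bound (68) can be made concrete with a small numerical sketch. It looks at a single channel state and ignores the vanishing error term; the function names and the example values are hypothetical. The bound is the minimum of the joint mutual information and the conditional mutual information increased by the total conferencing capacity, so the conferencing capacities stop helping once their sum covers the gap between the two.

```python
def sum_rate_bound(C1: float, C2: float, I_cond: float, I_joint: float) -> float:
    """Minimum of (C1 + C2 + I(Z;X,Y|U)) and I(Z;X,Y), as in the main term of (68)."""
    return min(C1 + C2 + I_cond, I_joint)


def minimal_total_conferencing(I_cond: float, I_joint: float) -> float:
    """Smallest C1 + C2 for which the first term no longer limits the bound."""
    return max(0.0, I_joint - I_cond)
```

For instance, with `I_cond = 0.5` and `I_joint = 1.0`, a total conferencing capacity of 0.2 gives a bound of 0.7, while any total of at least 0.5 gives the full value 1.0.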

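On the other hand, the validity of (54) implies that there is a $\tau \in T_1 \times T_2$ and a $W \in \mathcal{W}_\tau$ such that one of the following inequalities holds:

$$\frac{1}{n}\log M_1 > \tilde C_1 + I(Z_W; X_\tau | Y_\tau, U) + \varepsilon; \tag{69}$$

$$\frac{1}{n}\log M_2 > \tilde C_2 + I(Z_W; Y_\tau | X_\tau, U) + \varepsilon; \tag{70}$$

$$\frac{1}{n}\log M_1 M_2 > \bigl\{(\tilde C_1 + \tilde C_2 + I(Z_W; X_\tau, Y_\tau | U)) \wedge I(Z_W; X_\tau, Y_\tau)\bigr\} + \varepsilon. \tag{71}$$

According to which of (69)-(71) holds, we distinguish between three cases. In order to simplify notation, we write

$$(\tilde C_1 + \tilde C_2 + I(Z_W; X_\tau, Y_\tau | U)) \wedge I(Z_W; X_\tau, Y_\tau) =: I_0.$$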

Case 1: (71) holds. Then comparing (71) with (68) yields

$$1 - 2\lambda < \frac{I_0 + \frac{2}{n}\log 2}{I_0 + \varepsilon}.$$

But if $\lambda$ is chosen small enough, this gives a contradiction if $n$ is large depending on $\varepsilon$ and $\lambda$. Thus for small $\lambda = \lambda(\varepsilon)$ and large $n = n(\lambda, \varepsilon)$, there can be no code$_{\mathrm{AUX}}(n, M_1, M_2, \tilde C_1, \tilde C_2, \lambda)$ satisfying (71).

Case 2: (71) does not hold, but (69) holds. Together with (66), the fact that (71) does not hold implies

$$\frac{1}{n}\log M_1 \le \tilde C_1 + I(Z_W; X_\tau | Y_\tau, U) + \frac{2\log 2}{n} + 2\lambda(I_0 + \varepsilon).$$

Then using (69), we obtain

$$\lambda > \frac{\varepsilon}{2(I_0 + \varepsilon)} - \frac{\log 2}{n(I_0 + \varepsilon)}.$$

Thus for large $n$, there can be no code$_{\mathrm{AUX}}(n, M_1, M_2, \tilde C_1, \tilde C_2, \lambda)$ satisfying (69) if $\lambda$ is too small.

Case 3: (71) does not hold, but (70) holds. Analogous to Case 2.

This proves the weak converse for the auxiliary MAC.

B. The Converse for the MAC with Common Message 

We restrict ourselves here to describing the setting that is the starting point for the weak converse and to applying Fano's inequality. The rest is single-letterization of mutual information terms and similar to what was done in IV-A2. We assume full CSIR again.

For a $\lambda > 0$, let a code$_{\mathrm{CM}}(n, M_0, M_1, M_2, \lambda)$ be given. Let a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ be given on which the following random variables are defined:

• $S_0$ uniformly distributed on $[1, M_0]$,

• $S_1$ uniformly distributed on $[1, M_1]$ given $S_0$ and $S_2$ uniformly distributed on $[1, M_2]$ given $S_0$,

• for every $(\tau_1, \tau_2) \in T_1 \times T_2$, the codewords $X^{\tau_1}$ and $Y^{\tau_2}$ selected by $(S_0, S_1)$ and $(S_0, S_2)$, respectively,

• for every $W \in \mathcal{W}_\tau$ a random variable $Z^W$ such that

$$\mathbb{P}[Z^W = z \mid X^{\tau_1} = x, Y^{\tau_2} = y, S_0 = i, S_1 = j, S_2 = k] = W^n(z|x,y)$$

for all $x \in \mathcal{X}^n$, $y \in \mathcal{Y}^n$.
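If $W \in \mathcal{W}_\tau$, the definition of the code$_{\mathrm{CM}}$ and Fano's inequality imply

$$\begin{aligned}
\lambda\log(M_0 M_1 M_2 - 1) + h(\lambda) &\ge H(S_0, S_1, S_2 | Z^W)\\
&\ge H(S_1, S_2 | Z^W, S_0)\\
&\ge H(S_1 | Z^W, S_0, S_2) \vee H(S_2 | Z^W, S_0, S_1).
\end{aligned}$$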

From this point, replacing the message variables by the codeword variables and the single-letterization are very similar to the steps in the converse for the MAC with conferencing encoders, so we omit them. Thus the weak converse for Theorem 1 is proved.

[Figure 3: block diagram in which a SPLITTER feeds the base stations BS1 and BS2, which transmit to the MOBILE.]

Fig. 3. A central node distributing one data stream to two senders.

V. Application and Numerical Example 
A. Applications in Wireless Networks 

It was noted in the Introduction that the information-theoretic compound MAC with conferencing encoders can 
be used to analyze "virtual MISO systems". We now give the informal description of a simplified wireless "virtual 
MISO" network which we will then translate into our setting of compound MAC with conferencing encoders. 
Assume that one data stream intended for one receiving mobile terminal is to be transmitted. Two base stations, 
which are placed at spatially remote positions, are used to send the data to the destination. Assume that the base 
stations are fed by a central network node with their part of the information which is to be transmitted. At the 
receiver, the two streams received from the two base stations are then combined to form the original data stream. 
The question arises how the original data stream should be distributed by the central node in order to achieve a 
good performance. We will assume that the central node has the combined CSIT of both transmitters, which could
for example be achieved by feedback. The network is pictured in Figure 3.

The answer to this problem can be given immediately once one has translated the question into the setting of
compound MAC with generalized conferencing. If the data stream is not split at all, but both senders know the
complete message and also have the other transmitter's CSIT, then the full-cooperation sum capacity is achieved, 
i.e. the capacity of the system where the senders and the central node are all at the same location. The drawback of 
this scheme is that the capacity of each of the links from the central node to the base stations must be at least the 
full-cooperation sum capacity. The other extreme is if the central node just splits each message from the data stream 
into two components. Then the overhead which needs to be transmitted to the corresponding sender in addition to 
its message component is minimized. However, the full-cooperation sum capacity will not be achieved in general. 
The goal should be to find the minimal amount of overhead which suffices to achieve a good performance. 

From Theorem 2 it follows that it suffices for the splitter to send to the first base station, in addition to the first
component of the message, the one-shot Willems conferencing function value attained by the message of the second
component and the second sender's CSIT. The analogous statement holds for the overhead for the second sender.






The sum of the overhead rates required to achieve the full-cooperation sum capacity can be seen from Corollary 1.
See also the following numerical example.



B. Numerics 



We present a simple example of a rate region for the MAC with conferencing encoders. Assume $\mathcal{X} = \mathcal{Y} = \mathcal{Z} = \{0, 1\}$. Let $\mathcal{W}$ consist of the stochastic matrices



/ 



Wi = 



V 



.9 


.l\ 






(.9 


.1 


.4 


.6 


and 


W2 = 


.6 


.4 


.6 


.4 






.4 


.6 





li 






.0 


1 



Here, the output distribution corresponding to the input combination $(x, y)$ is written in row $2x + y + 1$.
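The row convention can be illustrated by a short computation of $I(Z; X, Y)$ for a binary-input MAC stored as a matrix with four rows. This is a sketch under stated assumptions: logarithms are taken to base 2, and the matrix `xor_channel` is a hypothetical noiseless example chosen so that the result is easy to verify by hand. It is not one of the matrices $W_1$, $W_2$ of this section.

```python
import math


def row_index(x: int, y: int) -> int:
    """1-based row of the output distribution for the input pair (x, y)."""
    return 2 * x + y + 1


def mutual_information(W: list, p_xy: dict) -> float:
    """I(Z; X, Y) in bits for channel matrix W and joint input distribution p_xy."""
    n_out = len(W[0])
    # Output distribution induced by the inputs.
    p_z = [
        sum(p * W[row_index(x, y) - 1][z] for (x, y), p in p_xy.items())
        for z in range(n_out)
    ]
    info = 0.0
    for (x, y), p in p_xy.items():
        for z in range(n_out):
            w = W[row_index(x, y) - 1][z]
            if p > 0.0 and w > 0.0:
                info += p * w * math.log2(w / p_z[z])
    return info


# Hypothetical noiseless channel: the output is x XOR y.
xor_channel = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
uniform_inputs = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
```

For the XOR example with uniform inputs the output is a uniform bit determined by the inputs, so the mutual information equals one bit. A channel whose four rows are all identical yields zero.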

In Figure 4 different capacity regions are pictured. The labels $W_1$ and $W_2$ denote the capacity regions of the MACs given by $W_1$ and $W_2$, respectively, without cooperation. Their intersection is the capacity region of the compound channel consisting of $W_1$ and $W_2$, where the exact channel is known at the transmitter. The capacity region in the case of no CSIT is shown for no cooperation ($C_{11} = 0$, $C_{12} = 0$). Note that absence of cooperation makes the region strictly smaller. $C_{21}$ and $C_{22}$ have been chosen such that their sum is the minimal $C_1 + C_2$ achieving the optimal sum capacity:

$$C_{21} = C_{22} = \frac{1}{2}\Bigl(C^{\infty} - \max\min_{i=1,2} I(Z_i; X, Y | U)\Bigr) = .29.$$

$C_{31} = .33$ has been chosen as .1 minus the minimal $C_1$ such that the first user achieves the maximal possible rate, and $C_{32} = .43$ has been chosen as the minimal $C_2$ such that the second user achieves the maximal possible rate. Finally, "full coop." denotes the rate region which can be achieved by full cooperation. As noted in Corollary 1, it can already be achieved with $C_1 = .47$ and $C_2 = .47$.

VI. Conclusion and Outlook 

We have derived the capacity regions of two information-theoretic compound multiple-access channels: the
compound multiple-access channel with common message and the compound multiple-access channel with conferencing
encoders, where conferencing can be done about messages and channel state information. The channel
with common message, aside from the interest it has on its own, was used to derive the capacity region of the
channel with conferencing encoders. The latter channel can be applied in the rigorous information-theoretic analysis
of certain wireless cellular networks which use base station cooperation in order to transmit data to one mobile
receiver. One can derive the exact amount of base station cooperation that is needed in order to achieve the sum
capacity and the capacity region as would be achievable if the base stations were at the same location and could
thus be regarded as forming a "virtual MISO system".

This analysis was motivated by recent developments in the design of cellular systems. As interference is the
main limiting factor in the performance of such systems, research has recently focused on methods of controlling
interference in order to meet the requirements for future wireless systems such as LTE-Advanced. Much of the
literature which has contributed to this research uses strict assumptions that will not generally be met in reality.
Assuming limited base station cooperation and channel uncertainty in this paper, we tried to obtain a more
appropriate description of real situations.

Fig. 4. The capacity regions for the conferencing capacity pairs $(C_{11}, C_{12}) = (0, 0)$, $(C_{21}, C_{22}) = (.29, .29)$, and $(C_{31}, C_{32}) = (.33, .43)$.

Note that we did not address the issue of unknown out-of-network interference. This is a problem for real networks.
Different systems operating in the same frequency band and operated by different providers who do not jointly
design their systems interfere with each other. This happens, e.g., when Wireless Local Area Network (WLAN) systems
are located close to each other. Future work will be to model this information-theoretically. The appropriate model
is to take multiple-access channels with conferencing encoders. However, in this case, channel uncertainty should
not be included by considering a compound channel, but rather, the model best describing reality is the arbitrarily
varying channel. In such a channel, the transmission probabilities can change for each channel use in a way unknown
to the encoder. (This is just the way unknown interference acts on channels.) Ahlswede's robustification technique
[1] shows how to construct codes for arbitrarily varying channels from codes for compound channels. Hence from
that point of view, the work done in the present paper can also be regarded as a preliminary needed for the analysis
of arbitrarily varying multiple-access channels with conferencing encoders.

Appendix 
Here, we include some technical lemmas concerning typical sequences.

Lemma 6. a) Let $\mathcal{X}$ be a finite set. Let $p, \tilde p \in \mathcal{P}(\mathcal{X})$. Let $0 < \delta < 1/(2|\mathcal{X}|)$. Then, for all $n \in \mathbb{N}$, for every

$\varphi_1$ is a universal function (i.e. independent of everything), positive if $|\mathcal{X}| > 1$ and $0 < \delta < 1$, and for all values of $|\mathcal{X}|$, one has $\lim_{\delta \to 0} \varphi_1(\delta, |\mathcal{X}|) = 0$.

b) Let $\mathcal{X}, \mathcal{Y}$ be finite sets. Let $p \in \mathcal{P}(\mathcal{X})$ and $W, \tilde W$ stochastic matrices with input alphabet $\mathcal{X}$ and output alphabet $\mathcal{Y}$. Let $0 < \delta < 1/(2|\mathcal{X}||\mathcal{Y}|)$. Let $r \in \mathcal{P}(\mathcal{X} \times \mathcal{Y})$ be the joint distribution corresponding to $p$ and $W$. Then, for all $n \in \mathbb{N}$, for all $(x, y) \in T^n_{r,\delta}$,

$\varphi_2$ is a universal function (i.e. independent of everything), positive if $|\mathcal{X}|, |\mathcal{Y}| > 1$, $0 < \delta < 1$, and for arbitrary $|\mathcal{X}|, |\mathcal{Y}|$, one has $\lim_{\delta \to 0} \varphi_2(\delta, |\mathcal{X}|, |\mathcal{Y}|) = 0$.

Proof: This is essentially [4, Lemma 1.2.6 and 1.2.7]. ■

Lemma 7. Let $\mathcal{X}$ be a finite set and let $p \in \mathcal{P}(\mathcal{X})$. Then, there is a universal constant $c > 0$ such that

Proof: This is exactly [14, Lemma III.1.3]. ■

The next lemma is not used in the text. However, it is used in the proof of Lemma 9 which we will prove. For $x \in \mathcal{X}^n$ and $W \in \mathcal{K}(\mathcal{Y}|\mathcal{X})$, denote by $T^n_{W,\delta}(x)$ the set of $y \in \mathcal{Y}^n$ that are $W$-generated by $x$ with constant $\delta$ (cf. [4, Definition 1.2.9]).

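Lemma 8. Let $\mathcal{X}, \mathcal{Y}$ be finite sets. Let $p \in \mathcal{P}(\mathcal{X})$ and $W \in \mathcal{K}(\mathcal{Y}|\mathcal{X})$. Let $0 < \delta < 1/(2|\mathcal{X}|)$. Then for any

$\varphi_3$ is a universal function (i.e. independent of everything), positive if $|\mathcal{X}|, |\mathcal{Y}| > 1$, $0 < \delta < 1$, and for arbitrary $|\mathcal{X}|, |\mathcal{Y}|$, one has $\lim_{\delta \to 0} \varphi_3(\delta, |\mathcal{X}|, |\mathcal{Y}|) = 0$.

Proof: This is essentially [4, Lemma 1.2.13]. ■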

The following lemma was already used in |8j. A slightly different form was proved in HI. As it is non-standard, 
we give a proof here. 

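Lemma 9. Let $\mathcal{W}$ be a nonempty set, and let $\mathcal{X}$ and $\mathcal{Y}$ be finite sets. For every $W \in \mathcal{W}$, let $W \in \mathcal{K}(\mathcal{Y}|\mathcal{X})$. Let $x \in \mathcal{X}^n$ and $p \in \mathcal{P}(\mathcal{X})$. Define the probability measure $q_W$ on $\mathcal{Y} \times \mathcal{X}$ by

$$q_W(y, x) = W(y|x)\,p(x).$$

Then,

$$\Bigl|\bigcup_{W \in \mathcal{W}} T^n_{q_W,\delta|x}\Bigr| \le 2^{n(\sup_{W \in \mathcal{W}} H(W|p) + \psi(\delta))}$$

for some universal positive function $\psi$ which tends to 0 as $\delta \to 0$.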

Proof: Denote by T{W) all the joint types q'my x X such that there is an VF G W with 

\qw{y,x) -q{y,x)\ < 6. 
Every T" g can be written as the union of some T--, where q e T(W). Hence 



(72) 



I I T" 



5|x 



W£W 



< 



U T- 

qeT{W) 



(73) 



(74) 



As there are at most (n + l)l'^lli'l different joint types in y x X"\ this is smaller than 

(n + l)!-^!!^! max |r^"|x| ■ 
qeT(W) I « ' I 

Without loss of generality, we can assume that the union on the left side of ( l73b is nonempty. Hence there is an 

W E W with T" , ^|x ^ 0, so X G T'"-.y.^. This implies for any qw which is close to q (in the sense of the 

definition of T{yV)) and for all a; G A" and y G 3^ 

$$\begin{aligned}
|q(y,x) - W(y|x)\,p_x(x)| &\le |q(y,x) - q_W(y,x)| + |q_W(y,x) - W(y|x)\,p_x(x)|\\
&\le \delta + W(y|x)\,|p(x) - p_x(x)|\\
&\le (|\mathcal{Y}| + 1)\delta.
\end{aligned}$$

Thus T?|x C T^(|-^n -^s_5(x). By Lemma[8] we conclude, using (JTS)) and ^^, that 



wew 



<(n + 1)1^11^1 sup 2"(^(^IP)-^(*)', 



which finishes the proof. 
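The polynomial factor in this proof comes from counting joint types. As a small sanity check, the sketch below counts the types with denominator $n$ on an alphabet of size $k$ exactly and compares the count with the bound $(n+1)^k$. For joint types on $\mathcal{Y} \times \mathcal{X}$ one would take $k = |\mathcal{X}||\mathcal{Y}|$. The function name `num_types` is illustrative.

```python
from math import comb


def num_types(n: int, k: int) -> int:
    """Number of types with denominator n on an alphabet of size k.

    A type is a vector of k nonnegative integer counts summing to n,
    so the number of types is the binomial coefficient C(n + k - 1, k - 1).
    """
    return comb(n + k - 1, k - 1)
```

For a binary alphabet and `n = 2` there are three types, with counts (0, 2), (1, 1) and (2, 0), and the bound gives $(2+1)^2 = 9$.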
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